├── CHANGELOG ├── README.md ├── alias.lisp ├── cl-interpol.asd ├── docs └── index.html ├── packages.lisp ├── read.lisp ├── specials.lisp ├── test ├── create_perl_tests.pl ├── packages.lisp ├── perltests ├── simple └── tests.lisp └── util.lisp /CHANGELOG: -------------------------------------------------------------------------------- 1 | Version 0.2.7 2 | 2020-07-02 3 | Added support for name-readtables. 4 | Now interpol syntax can be activated using: (named-readtables:in-readtable :interpol-syntax). 5 | This way integration with SLIME and SLY is possible. 6 | 7 | Version 0.2.6 8 | 2016-08-26 9 | Merge pull request #5 from agrostis/master (Hans Hübner) 10 | Followup to 100efc6: added documentation and tests. (Boris Smilga) 11 | Followup to 100efc6: ignore non-newline whitespaces only, as per CLHS 22.3.9.3. (Boris Smilga) 12 | Merge pull request #4 from agrostis/master (Hans Hübner) 13 | Backspace + newline as an escape sequence emulating tilde + newline in CL:FORMAT. (Boris Smilga) 14 | 15 | Version 0.2.5 16 | 2015-11-22 17 | Fix docstring as suggested by Stas (Hans Hübner) 18 | 19 | Version 0.2.4 20 | 2015-11-22 21 | Exnabled direct use of cl-interpol-reader (Mike Maul) 22 | 23 | Version 0.2.3 24 | 2014-11-28 25 | update support info (Hans Huebner) 26 | 27 | Version 0.2.2 28 | 2013-10-04 29 | inline format directive syntax and an extra argument to enable-interpol-syntax (Marco Baringer) 30 | 31 | Version 0.2.1 32 | 2008-07-25 33 | Fixed typo in HTML documentation 34 | 35 | Version 0.2.0 36 | 2008-07-24 37 | Base Unicode support on CL-UNICODE 38 | Add new CL-PPCRE special characters for named registers and named properties 39 | Re-architecture test suite 40 | 41 | Version 0.1.2 42 | 2004-12-16 43 | Added hyperdoc support 44 | Added :CL-INTERPOL to *FEATURES* 45 | Typo fixes in doc/index.html 46 | 47 | Version 0.1.1 48 | 2003-12-21 49 | Fixed an embarrassing bug where COLLECTOR was re-used in read.lisp (reported by Hans Hübner) 50 | More tests, better failure reporting 51 | Tried to increase readability of docs 52 | Mentioned Debian and Gentoo in docs 53 | 54 | Version 0.1.0 55 | 2003-10-22 56 | Initial release 57 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | -------------------------------------------------- 2 | CL-INTERPOL - String interpolation for Common Lisp 3 | -------------------------------------------------- 4 | 5 | CL-INTERPOL is a library for Common Lisp which modifies the reader so 6 | that you can have interpolation within strings similar to Perl or Unix Shell 7 | scripts. It also provides various ways to insert arbitrary characters 8 | into literal strings even if your editor/IDE doesn't support them. 9 | Here's an example: 10 | 11 | ~~~lisp 12 | (named-readtables:in-readtable :interpol-syntax) 13 | 14 | 15 | (let ((a 42)) 16 | #?"foo: \xC4\N{Latin capital letter U with diaeresis}\nbar: ${a}") 17 | "foo: ÄÜ 18 | bar: 42" 19 | ~~~ 20 | 21 | CL-INTERPOL comes with a [BSD-style 22 | license](http://www.opensource.org/licenses/bsd-license.php) so you 23 | can basically do with it whatever you want. 24 | 25 | Complete documentation for CL-INTERPOL can be found in the `docs` 26 | directory or at [the project documentation 27 | site](http://edicl.github.io/cl-interpol/). 28 | -------------------------------------------------------------------------------- /alias.lisp: -------------------------------------------------------------------------------- 1 | ;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-INTERPOL; Base: 10 -*- 2 | ;;; $Header: /usr/local/cvsrep/cl-interpol/alias.lisp,v 1.3 2008/07/23 14:41:37 edi Exp $ 3 | 4 | ;;; Copyright (c) 2003-2008, Dr. Edmund Weitz. All rights reserved. 5 | 6 | ;;; Redistribution and use in source and binary forms, with or without 7 | ;;; modification, are permitted provided that the following conditions 8 | ;;; are met: 9 | 10 | ;;; * Redistributions of source code must retain the above copyright 11 | ;;; notice, this list of conditions and the following disclaimer. 12 | 13 | ;;; * Redistributions in binary form must reproduce the above 14 | ;;; copyright notice, this list of conditions and the following 15 | ;;; disclaimer in the documentation and/or other materials 16 | ;;; provided with the distribution. 17 | 18 | ;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED 19 | ;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 20 | ;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 21 | ;;; ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY 22 | ;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 23 | ;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE 24 | ;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 25 | ;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, 26 | ;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 27 | ;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 28 | ;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 29 | 30 | (in-package :cl-interpol) 31 | 32 | ;;; define some aliases 33 | (loop for (alias . name) in '(("LINE FEED" . "LINE FEED \(LF)") 34 | ("FORM FEED" . "FORM FEED \(FF)") 35 | ("CARRIAGE RETURN" . "CARRIAGE RETURN \(CR)") 36 | ("NEXT LINE" . "NEXT LINE \(NEL)") 37 | ("LF" . "LINE FEED \(LF)") 38 | ("FF" . "FORM FEED \(FF)") 39 | ("CR" . "CARRIAGE RETURN \(CR)") 40 | ("NEL" . "NEXT LINE \(NEL)") 41 | ("ZWNJ" . "ZERO WIDTH NON-JOINER") 42 | ("ZWJ" . "ZERO WIDTH JOINER") 43 | ("BYTE ORDER MARK" . "ZERO WIDTH NO-BREAK SPACE") 44 | ("BOM" . "BYTE ORDER MARK") 45 | ("HORIZONTAL TABULATION" . "CHARACTER TABULATION") 46 | ("VERTICAL TABULATION" . "LINE TABULATION") 47 | ("FILE SEPARATOR" . "INFORMATION SEPARATOR FOUR") 48 | ("GROUP SEPARATOR" . "INFORMATION SEPARATOR THREE") 49 | ("RECORD SEPARATOR" . "INFORMATION SEPARATOR TWO") 50 | ("UNIT SEPARATOR" . "INFORMATION SEPARATOR ONE") 51 | ("PARTIAL LINE DOWN" . "PARTIAL LINE FORWARD") 52 | ("PARTIAL LINE UP" . "PARTIAL LINE BACKWARD")) 53 | for existing-char = (character-named name) 54 | when existing-char 55 | do (setf (gethash (canonicalize-name alias) *unicode-aliases*) existing-char)) 56 | -------------------------------------------------------------------------------- /cl-interpol.asd: -------------------------------------------------------------------------------- 1 | ;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-USER; Base: 10 -*- 2 | ;;; $Header: /usr/local/cvsrep/cl-interpol/cl-interpol.asd,v 1.11 2008/07/25 12:51:58 edi Exp $ 3 | 4 | ;;; Copyright (c) 2003-2008, Dr. Edmund Weitz. All rights reserved. 5 | 6 | ;;; Redistribution and use in source and binary forms, with or without 7 | ;;; modification, are permitted provided that the following conditions 8 | ;;; are met: 9 | 10 | ;;; * Redistributions of source code must retain the above copyright 11 | ;;; notice, this list of conditions and the following disclaimer. 12 | 13 | ;;; * Redistributions in binary form must reproduce the above 14 | ;;; copyright notice, this list of conditions and the following 15 | ;;; disclaimer in the documentation and/or other materials 16 | ;;; provided with the distribution. 17 | 18 | ;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED 19 | ;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 20 | ;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 21 | ;;; ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY 22 | ;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 23 | ;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE 24 | ;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 25 | ;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, 26 | ;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 27 | ;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 28 | ;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 29 | 30 | (defsystem "cl-interpol" 31 | :version "0.2.7" 32 | :license "BSD-2-Clause" 33 | :serial t 34 | :depends-on ("cl-unicode" 35 | "named-readtables") 36 | :components ((:file "packages") 37 | (:file "specials") 38 | (:file "util") 39 | (:file "alias") 40 | (:file "read")) 41 | :in-order-to ((test-op (test-op "cl-interpol/test")))) 42 | 43 | (defsystem "cl-interpol/test" 44 | :depends-on ("cl-interpol" 45 | "flexi-streams") 46 | :components ((:module "test" 47 | :serial t 48 | :components ((:file "packages") 49 | (:file "tests")))) 50 | :perform (test-op (o c) 51 | (symbol-call :cl-interpol-test :run-all-tests))) 52 | -------------------------------------------------------------------------------- /docs/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | CL-INTERPOL - String interpolation for Common Lisp 6 | 24 | 25 | 26 | 27 | 28 | 29 |

CL-INTERPOL - String interpolation for Common Lisp

30 | 31 |
 
32 | "The crux of the biscuit is the apostrophe." (Frank Zappa) 33 | 34 |
35 |
 

Abstract

36 | 37 | CL-INTERPOL is a library for Common Lisp which modifies the reader so 38 | that you can have interpolation within strings similar to Perl or Unix Shell 39 | scripts. It also provides various ways to insert arbitrary characters 40 | into literal strings even if your editor/IDE doesn't support them. 41 | Here's an example: 42 |
 43 | * (ql:quickload :cl-interpol)
 44 | * (named-readtables:in-readtable :interpol-syntax)
 45 | * (let ((a 42))
 46 |     #?"foo: \xC4\N{Latin capital letter U with diaeresis}\nbar: ${a}")
 47 | "foo: ÄÜ
 48 | bar: 42"
 49 | 
50 | If you're looking for an alternative syntax for characters, see 51 | CL-UNICODE. 52 |

53 | CL-INTERPOL comes with a BSD-style 55 | license so you can basically do with it whatever you want. 56 | 57 |

58 | Download current version or visit the project on Github. 59 |

60 | 61 |
 

Contents

62 |
    63 |
  1. Download and installation 64 |
  2. Support 65 |
  3. Syntax 66 |
      67 |
    1. Backslashes 68 |
    2. Interpolation 69 |
    3. Support for CL-PPCRE/Perl regular expressions 70 |
    71 |
  4. The CL-INTERPOL dictionary 72 |
      73 |
    1. enable-interpol-syntax 74 |
    2. disable-interpol-syntax 75 |
    3. *list-delimiter* 76 |
    4. *outer-delimiters* 77 |
    5. *inner-delimiters* 78 |
    6. *interpolate-format-directives* 79 |
    7. *regex-delimiters* 80 |
    81 |
  5. Known issues 82 |
      83 |
    1. {n,m} modifiers in extended mode 84 |
    85 |
  6. Acknowledgements 86 |
87 | 88 |
 

Download and installation

89 | 90 | CL-INTERPOL together with this documentation can be downloaded from Github. The 92 | current version is 0.2.7. 93 |

94 | CL-INTERPOL comes with a system definition for ASDF so you can install the library with 96 |

 97 | (asdf:load-system :cl-interpol)
 98 | 
99 | if you've unpacked it in a place where ASDF can find it. It depends on CL-UNICODE and NAMED-READTABLES. Installation 100 | via asdf-install or 101 | Quicklisp 102 | should also be possible. 103 |

104 | Note: Before you can actually use the new reader 105 | syntax you have to enable it with 106 | ENABLE-INTERPOL-SYNTAX 107 | or via named-readtables: 108 |

(named-readtables:in-readtable :interpol-syntax)
109 | 110 |

111 | You can run a test suite which tests most aspects of the library with 112 |

113 | (asdf:test-system :cl-interpol)
114 | 
115 | The test suite depends on FLEXI-STREAMS. 116 |

117 | The development version of cl-interpol can be found on 118 | github. Please use the github issue tracking system to 119 | submit bug reports. Patches are welcome, please use GitHub pull 120 | requests. 121 | 122 |
 

Syntax

123 | 124 | CL-INTERPOL installs ? (question mark) as a 125 | "sub-character" of the dispatching 127 | macro character # (sharpsign), i.e. it relies on 128 | the fact that sharpsign is a dispatching macro character in the current 130 | readtable when ENABLE-INTERPOL-SYNTAX 132 | is invoked. 133 |

134 | The question mark may optionally be followed by an R and 135 | an X (case doesn't matter) - see the 136 | section about regular expression syntax below. If both of them are 137 | present, the R must precede the X. 138 |

139 | The next character is the opening outer delimiter which may 140 | be one of " (double quote), ' 141 | (apostrophe), | (vertical bar), # 142 | (sharpsign), / (slash), ( (left 143 | parenthesis), < (less than), [ (left 144 | square bracket), or { (left curly bracket). (But see *OUTER-DELIMITERS*.) 145 |

146 | The following characters comprise the string which is read until the 147 | closing outer delimiter is seen. The closing outer delimiter 148 | is the same character as the opening outer delimiter - unless the 149 | opening delimiter was one of the last four described below in which 150 | case the closing outer delimiter is the corresponding closing (right) 151 | bracketing character. So these are all valid CL-INTERPOL string 152 | equivalent to "abc": 153 | 154 |

155 | * #?"abc"
156 | "abc"
157 | * #?r"abc"
158 | "abc"
159 | * #?x"abc"
160 | "abc"
161 | * #?rx"abc"
162 | "abc"
163 | * #?'abc'
164 | "abc"
165 | * #?|abc|
166 | "abc"
167 | * #?#abc#
168 | "abc"
169 | * #?/abc/
170 | "abc"
171 | * #?(abc)
172 | "abc"
173 | * #?[abc]
174 | "abc"
175 | * #?{abc}
176 | "abc"
177 | * #?<abc>
178 | "abc"
179 | 
180 | 181 | A character which would otherwise be a closing outer delimiter can be 182 | escaped by a backslash immediately preceding it (unless this backslash 183 | is itself escaped by another backslash). Also, the bracketing 184 | delimiters can nest, i.e. a right bracketing character which might 185 | otherwise be closing outer delimiter will be read as part of the 186 | string if it is matched by a preceding left bracketing character 187 | within the string. 188 |
189 | * #?"abc"
190 | "abc"
191 | * #?"abc\""
192 | "abc\""
193 | * #?"abc\\"
194 | "abc\\"
195 | * #?[abc]
196 | "abc"
197 | * #?[a[b]c]
198 | "a[b]c"
199 | * #?[a[[b]]c]
200 | "a[[b]]c"
201 | * #?[a[[][]]b]
202 | "a[[][]]b"
203 | 
204 | 205 | The characters between the outer delimiters are read one by one and 206 | inserted into the resulting string as is unless one of the special 207 | characters \ (backslash), $ (dollar sign), 208 | or @ (at-sign) is encountered. The behaviour with respect 209 | to these special characters is modeled after Perl because CL-INTERPOL 210 | is intended to be usable with CL-PPCRE. 212 | 213 |

Backslashes

Here's a short 214 | summary of what might occur after a backslash, originally copied 215 | from man perlop. Details below - you can 216 | click on the entries in this table to go to the corresponding paragraph. 217 | 218 |
219 |   \t          tab             (HT, TAB)
220 |   \n          newline         (NL)
221 |   \r          return          (CR)
222 |   \f          form feed       (FF)
223 |   \b          backspace       (BS)
224 |   \a          alarm (bell)    (BEL)
225 |   \e          escape          (ESC)
226 |   \033        octal char      (ESC)
227 |   \x1b        hex char        (ESC)
228 |   \x{263a}    wide hex char   (SMILEY)
229 |   \c[         control char    (ESC)
230 |   \N{name}    named char
231 | 
232 |   \l          lowercase next char
233 |   \u          uppercase next char
234 |   \L          lowercase till \E
235 |   \U          uppercase till \E
236 |   \E          end case modification
237 |   \Q          quote non-word characters till \E
238 | 
239 |   \          ignore the newline and following whitespaces
240 | 
241 |

242 | If a backslash is followed by 243 | n, r, f, b, 244 | a, or e (all lowercase) then the corresponding character 245 | #\Newline, #\Return, #\Page, 246 | #\Backspace, (CODE-CHAR 7), or 247 | (CODE-CHAR 27) is inserted into the string. 248 |

249 | * #?"New\nline"
250 | "New
251 | line"
252 | 
253 |

254 | If a backslash is followed by one of 255 | the digits 0 to 9, then this digit and 256 | the following characters are read and parsed as octal digits and will 257 | be interpreted as the character code of the character to insert 258 | instead of this sequence. The sequence ends with the first character 259 | which is not an octal digit but at most three digits will be 260 | read. Only the rightmost eight bits of the resulting number will be 261 | used for the character code. 262 | 263 |

264 | * #?"\40\040"
265 | "  "  ;; two spaces
266 | * (map 'list #'char-code #?"\0\377\777")
267 | (0 255 255)  ;; note that \377 and \777 yield the same result
268 | * #?"Only\0403 digits!"
269 | "Only 3 digits!"
270 | * (map 'list #'identity #?"\9")
271 | (#\9)
272 | 
273 |

274 | If a backslash is followed by an x (lowercase) the 275 | following characters are read and parsed as hexadecimal digits and 276 | will be interpreted as the character code of the character to insert 277 | instead of this sequence. The sequence of hexadecimal digits ends with 278 | the first character which is not one of the characters 0 279 | to 9, a to f, or A 280 | to F but at most two digits will be read. If the 281 | character immediately following the x is a { 282 | (left curly bracket), then all the following characters up to a 283 | } (right curly bracket) must be hexadecimal digits and 284 | comprise a number which'll be taken as the character code (and which 285 | obviously should denote a character known by your Lisp 286 | implementation). Note that in both case it is legal that zero digits 287 | will be read which'll be interpreted as the character code 288 | 0. 289 |

290 | * (char #?"\x20" 0)
291 | #\Space
292 | * (char-code (char #?"\x" 0))
293 | 0
294 | * (char-code (char #?"\x{}" 0))
295 | 0
296 | * (unicode-name (char #?"\x{2323}" 0))
297 | "SMILE"
298 | * #?"Only\x202 digits!"
299 | "Only 2 digits!"
300 | 
301 |

302 | If a backslash is followed by a 303 | c (lowercase) then the ASCII control 305 | code of the following character is inserted into the string. Note 306 | that this only defined for A to Z, 307 | [, \, ], ^, and 308 | _ although CL-INTERPOL will also accept other 309 | characters. In fact, the transformation is implemented as 310 |

311 | (code-char (logxor #x40 (char-code (char-upcase <char>))))
312 | 
313 | where <char> is the character following \c. 314 |
315 | * (char-name (char #?"\cH" 0))
316 | ;; see 13.1.7 of the ANSI standard, though
317 | "Backspace"
318 | * (char= (char #?"\cj" 0) #\Newline)
319 | T
320 | 
321 | 322 |

323 | If a backslash is followed by an 324 | N (uppercase) the following character must be a 325 | { (left curly bracket). The characters 326 | following the bracket are read until a } 327 | (right curly bracket) is seen and comprise the Unicode name 328 | of the character to be inserted into the string. This name is 329 | interpreted as a Unicode character name 330 | by CL-UNICODE and returns 331 | the 332 | character CHARACTER-NAMED. 333 | This obviously also means that you can fine-tune this behaviour using 334 | CL-UNICODE's global special variables. 335 |

336 | * (unicode-name (char #?"\N{Greek capital letter Sigma}" 0))
337 | "GREEK CAPITAL LETTER SIGMA"
338 | * (unicode-name (char #?"\N{GREEK CAPITAL LETTER SIGMA}" 0))
339 | "GREEK CAPITAL LETTER SIGMA"
340 | * (setq *try-abbreviations-p* t)
341 | T
342 | * (unicode-name (char #?"\N{Greek:Sigma}" 0))
343 | "GREEK CAPITAL LETTER SIGMA"
344 | * (unicode-name (char #?"\N{Greek:sigma}" 0))
345 | "GREEK SMALL LETTER SIGMA"
346 | * (setq *scripts-to-try* "Greek")
347 | "Greek"
348 | * (unicode-name (char #?"\N{Sigma}" 0))
349 | "GREEK CAPITAL LETTER SIGMA"
350 | * (unicode-name (char #?"\N{sigma}" 0))
351 | "GREEK SMALL LETTER SIGMA"
352 | 
353 | 354 | Of course, \N won't magically make your Lisp implementation Unicode-aware. You can only use the names of characters that are actually supported by your Lisp. 355 |

356 | If a backslash is followed by an 357 | l or a u (both lowercase) the following 358 | character (if any) is downcased or uppercased respectively. 359 |

360 | * #?"\lFOO"
361 | "fOO"
362 | * #?"\ufoo"
363 | "Foo"
364 | * #?"\l"
365 | ""
366 | 
367 | 368 |

369 | If a backslash is followed by an 370 | L or a U (both uppercase) the following 371 | characters up to \E (uppercase) or another \L or 372 | \U are upcased 374 | or downcased respectively. While \E simply ends the 375 | scope of \L or \U, another \L 376 | or \U will introduce a new round of upcasing or 377 | downcasing. 378 |

379 | * #?"\Ufoo\Ebar"
380 | "FOObar"
381 | * #?"\LFOO\EBAR"
382 | "fooBAR"
383 | * #?"\LFOO\Ubar"
384 | "fooBAR"
385 | * #?"\LFOO"
386 | "foo"
387 | 
388 | These examples may seem trivial but \U and friends might be very helpful if you interpolate strings. 389 | 390 |

391 | If a backslash is followed by a 392 | Q (uppercase) the following characters up to \E (uppercase) are quoted, i.e. every character except for 0 393 | to 9, a to z, A 394 | to Z, and _ (underscore) is preceded by a backslash. Corresponding pairs of \Q and \E can be nested. 395 |

396 | * #?"-\Q-\E-"
397 | "-\\--"
398 | * #?"\Q-\Q-\E-\E"
399 | "\\-\\\\\\-\\-"
400 | * #?"-\Q-"
401 | "-\\-"
402 | 
403 | As you might have noticed, \E is used to end the scope of \Q as well as that of \L and \U. As a consequence, pairs of \Q and \E can be nested between \L or \U and \E and vice-versa but each occurence of \L or \U which is preceded by another \L or \U will immediately end the scope of all enclosed \Q modifiers. Hmm, need an example? 404 |
405 | * #?"\LAa-\QAa-\EAa-\E"
406 | "aa-aa\\-aa-"
407 | * #?"\QAa-\LAa-\EAa-\E"
408 | "Aa\\-aa\\-Aa\\-"
409 | * #?"\U\QAa-\LAa-\EAa-\E"
410 | "AA\\-aa-Aa-" ;; note that only the first hyphen is quoted now
411 | 
412 | 413 | Quoting characters with \Q is especially helpful if you want to interpolate a string verbatim into a regular expression. 414 | 415 |

416 | If a backslash is placed at the end of a line, it works as the tilde newline directive to Common Lisp's FORMAT function. That is, the newline immediately following the backslash and any non-newline whitespace characters after the newline are ignored. This escape sequence allows to break long string literals into several lines of code, so as to maintain convenient line width and indentation of code. 417 |

418 | * #?"@@ -1,11 +1,12 @@\n Th\n-e\n+at\n  quick b\n\
419 |      @@ -22,18 +22,17 @@\n jump\n-s\n+ed\n  over \n\
420 |      -the\n+a\n  laz\n"
421 | "@@ -1,11 +1,12 @@
422 |  Th
423 | -e
424 | +at
425 |   quick b
426 | @@ -22,18 +22,17 @@
427 |  jump
428 | -s
429 | +ed
430 |   over
431 | -the
432 | +a
433 |   laz
434 | "
435 | 
436 | 437 |

438 | All other characters following a backslash are left as is and inserted into the string. This is also true for the backslash itself, for $, @, and - as mentioned above - for the outer closing delimiter. 439 |

440 | * #?"\"\\f\o\o\""
441 | "\"\\foo\""
442 | 
443 | 444 |
 

Interpolation

445 | 446 | If a $ (dollar sign) or @ (at-sign) is seen 447 | and followed by one of { (left curly bracket), [ (left square bracket), < (less than), or ( (left parenthesis) (but see *INNER-DELIMITERS*), the 448 | characters following the bracket are read up to the corresponding closing (right) 449 | bracketing character. They are read as Lisp forms and treated as an implicit 451 | progn the result of which will be inserted into the string at 452 | execution time. (Technically this is done by temporarily making the syntax of the closing right bracketing character in the current 454 | readtable be the same as the syntax of ) (right parenthesis) in the standard readtable and then reading the forms with READ-DELIMITED-LIST.) 455 |

456 | The result of the forms following a $ (dollar sign) is inserted into the string as with PRINC at execution time. The result of the forms following an @ (at-sign) must be a list. The elements of this list are inserted into the string one by one as with PRINC interspersed (or "joined" if you prefer) with the contents of the variable *LIST-DELIMITER* (also inserted as with PRINC). 457 |

458 | Every other $ or @ is inserted into the string as is. 459 |

460 | * (let* ((a "foo")
461 |          (b #\Space)
462 |          (c "bar")
463 |          (d (list a b c))
464 |          (x 40))
465 |     (values #?"$ @"
466 |             #?"$(a)"
467 |             #?"$<a>$[b]"
468 |             #?"\U${a}\E \u${a}"
469 |             (let ((*list-delimiter* #\*))
470 |               #?"@{d}")
471 |             (let ((*list-delimiter* ""))
472 |               #?"@{d}")
473 |             #?"The result is ${(let ((y 2)) (+ x y))}"
474 |             #?"${#?'${a} ${c}'} ${x}"))  ;; note the embedded CL-INTERPOL string
475 | "$ @"
476 | "foo"
477 | "foo "
478 | "FOO Foo"
479 | "foo* *bar"
480 | "foo bar"
481 | "The result is 42"
482 | "foo bar 40"
483 | 
484 | Interpolations are realized by creating code which is evaluated at 485 | execution time. For example, the expansion of 486 | #?"\Q-\l${(let ((x 40)) (+ x 2))}" might look 487 | like this: 488 | 489 |
490 | (with-output-to-string (#:G1098)
491 |   (write-string (cl-ppcre:quote-meta-chars
492 |                  (with-output-to-string (#:G1099)
493 |                    (write-string "-" #:G1099)
494 |                    (let ((#:G1100
495 |                            (format nil "~A"
496 |                                    (progn
497 |                                      (let ((x 40))
498 |                                        (+ x 2))))))
499 |                      (when (plusp (length #:G1100))
500 |                        (setf (char #:G1100 0)
501 |                                (char-downcase (char #:G1100 0))))
502 |                      (write-string #:G1100 #:G1099))))
503 |                 #:G1098))
504 | 
505 | 506 | However, if a string read by CL-INTERPOL does not contain interpolations, it is guaranteed to be expanded into a constant Lisp string. 507 | 508 |
 

Support for CL-PPCRE/Perl regular expressions

509 | 510 | Beyond what has been explained above CL-INTERPOL can support Perl regular expression syntax. This feature is mainly intended for use with CL-PPCRE (version 0.7.0 or higher). The regular expression mode is switched on if the opening outer delimiter is a / (slash) - but see *REGEX-DELIMITERS*. It is also on if there's an r (lowercase or uppercase) in front of the opening outer delimiter. If there's also an x (lowercase or uppercase) in front of the opening outer delimiter (but behind the r if it's there), the string will be read in extended mode (see man perlre for a detailed explanation). In these modes the following things are different from what's described above: 511 | 583 | 584 | If all this seems complicated, just keep in mind that this mode is 585 | meant so that you can feed strings to CL-PPCRE exactly as if you had 587 | written them for Perl (without counting Lisp backslashes 588 | versus Perl backslashes). However, you should not use 589 | both CL-INTERPOL's as well as CL-PPCRE's extended mode at once because 590 | this might lead to errors. (CL-PPCRE's will, e.g., throw away 591 | whitespace which had been escaped in CL-INTERPOL.) 592 |
593 | * (let ((scanner (cl-ppcre:create-scanner " a\\ a " :extended-mode t)))
594 |     (cl-ppcre:scan scanner "a a"))
595 | 0
596 | 3
597 | #()
598 | #()
599 | * (let ((scanner (cl-ppcre:create-scanner #?x/ a\ a /)))
600 |     (cl-ppcre:scan scanner "a a"))
601 | 0
602 | 3
603 | #()
604 | #()
605 | * (let ((scanner (cl-ppcre:create-scanner #?x/ a\ a / :extended-mode t)))
606 |     ;; wrong, because extended mode is applied twice
607 |     (cl-ppcre:scan scanner "a a"))
608 | NIL
609 | 
610 | 611 |
 

The CL-INTERPOL dictionary

612 | 613 | CL-INTERPOL exports the following symbols: 614 | 615 |


[Macro] 616 |
enable-interpol-syntax &key modify-*readtable*=> | 617 | 618 |

619 | 620 |

This is used to enable the reader syntax described above. This macro 621 | expands into an EVAL-WHEN 623 | so that if you use it as a top-level 625 | form in a file to be loaded and/or compiled it'll do what you 626 | expect.

627 | 628 |

If the parameter modify-*readtable* is NIL (the 629 | default) this will push the current 631 | readtable on a stack so that matching calls of 632 | ENABLE-INTERPOL-SYNTAX and DISABLE-INTERPOL-SYNTAX 634 | can nest. Otherwise the current value of *readtable* will 635 | be modified.

636 | 637 |

Note: by default the reader syntax is not 638 | enabled after loading CL-INTERPOL.

639 | 640 |
641 | 642 |


[Macro] 643 |
disable-interpol-syntax => | 644 | 645 |


646 | 647 | This is used to disable the reader syntax described above. This macro 648 | expands into an EVAL-WHEN 650 | so that if you use it as a top-level 652 | form in a file to be loaded and/or compiled it'll do what you 653 | expect. Technically this'll pop a readtable from the stack described above so that matching calls of ENABLE-INTERPOL-SYNTAX and DISABLE-INTERPOL-SYNTAX can nest. If the stack is empty (i.e. when DISABLE-INTERPOL-SYNTAX is called without a preceding call to ENABLE-INTERPOL-SYNTAX), the standard readtable is re-established. 654 | 655 |
656 | 657 |


[Special variable] 658 |
*list-delimiter* 659 | 660 |


661 | 662 | The contents of this variable are inserted between the elements of a list interpolated with @ at execution time. They are inserted as with PRINC. The default value is " " (one space). 663 | 664 |
665 | 666 |


[Special variable] 667 |
*outer-delimiters* 668 | 669 |


670 | 671 | This is a list of acceptable outer delimiters. The elements of this list are either characters or dotted pairs the car and cdr of which are characters. A character denotes a delimiter like ' (apostrophe) which is the opening as well as the closing delimiter. A dotted pair like (#\{ . #\}) denotes a pair of matching bracketing delimiters. The name of this list is exported so that you can customize CL-INTERPOL's behaviour by removing elements from this list, you are advised not to add any - specifically you should not add alphanumeric characters or the backslash. Note that this variable has effect at read time so you probably need to wrap an EVAL-WHEN around forms that change its value. The default value is 673 |
674 | '((#\( . #\))
675 |   (#\{ . #\})
676 |   (#\< . #\>)
677 |   (#\[ . #\])
678 |   #\/ #\| #\" #\' #\#))
679 | 
680 | 681 |
682 | 683 |


[Special variable] 684 |
*inner-delimiters* 685 | 686 |


687 | 688 | This is a list of acceptable delimiters for interpolation. The elements of this list are either characters or dotted pairs the car and cdr of which are characters. A character denotes a delimiter like ' (apostrophe) which is the opening as well as the closing delimiter. A dotted pair like (#\{ . #\}) denotes a pair of matching bracketing delimiters. The name of this list is exported so that you can customize CL-INTERPOL's behaviour by removing elements from this list, you are advised not to add any - specifically you should not add alphanumeric characters or the backslash. Note that this variable has effect at read time so you probably need to wrap an EVAL-WHEN around forms that change its value. The default value is 690 |
691 | '((#\( . #\))
692 |   (#\{ . #\})
693 |   (#\< . #\>)
694 |   (#\[ . #\]))
695 | 
696 | 697 |
698 | 699 |


[Special variable] 700 |
*interpolate-format-directives* 701 | 702 |


703 | 704 | This is a boolean value which determines if the ~ character signals 705 | the start of an inline format directive. When T sequences with this 706 | form: 707 | 708 |
709 | ~paramsX(form)
710 | 
711 | 712 | Will be passed to cl:format, with FORM as 713 | the one and only argument and params and X 714 | are the format directive (with the same syntax as in 715 | cl:format). 716 | 717 | Examples: 718 | 719 |
720 | * (let ((x 42)) #?"An integer: ~D(x) ~X(x) ~8,'0B(x)")
721 | "An integer: 42 2A 00101010"
722 | 
723 | 724 |
725 | 726 |


[Special variable] 727 |
*regex-delimiters* 728 | 729 |


730 | 731 | This is a list of opening outer delimiters which automatically switch CL-INTERPOL's regular expression mode on. The elements of this list are characters. An element of this list must also be an element of *OUTER-DELIMITERS* to have any effect. 732 | Note that this variable has effect at read time so you probably need to wrap an EVAL-WHEN around forms that change its value. The default value is the one-element list '(#\/). 734 | 735 |
736 | 737 |
 

Known issues

738 | 739 |

{n,m} modifiers in extended mode

740 | 741 | CL-INTERPOL treats 'potential' {n,m} modifiers differently from CL-PPCRE or Perl in extended mode if they contain whitespace. CL-INTERPOL will simply remove the whitespace and thus make them valid modifiers for CL-PPCRE while Perl will remove the whitespace but not recognize the character sequence as a modifier. CL-PPCRE behaves like Perl - you decide if this behaviour is sane...:) 742 |
743 | * (let ((scanner (cl-ppcre:create-scanner "^a{3, 3}$" :extended-mode t)))
744 |     (cl-ppcre:scan scanner "aaa"))
745 | NIL
746 | * (let ((scanner (cl-ppcre:create-scanner "^a{3, 3}$" :extended-mode t)))
747 |     (cl-ppcre:scan scanner "a{3,3}"))
748 | 0
749 | 6
750 | #()
751 | #()
752 | * (cl-ppcre:scan #?x/^a{3, 3}$/ "aaa")
753 | 0
754 | 3
755 | #()
756 | #()
757 | * (cl-ppcre:scan #?x/^a{3, 3}$/ "a{3, 3}")
758 | NIL
759 | 
760 | 761 |
 

Acknowledgements

762 | 763 | Thanks to Peter Seibel who had the idea to do this to 764 | make CL-PPCRE more 765 | convenient. Buy his 766 | book!!! 767 | 768 |

769 | $Header: /usr/local/cvsrep/cl-interpol/doc/index.html,v 1.39 2008/07/25 12:52:00 edi Exp $ 770 |

BACK TO THE HOMEPAGE 771 | 772 | 773 | 774 | -------------------------------------------------------------------------------- /packages.lisp: -------------------------------------------------------------------------------- 1 | ;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-USER; Base: 10 -*- 2 | ;;; $Header: /usr/local/cvsrep/cl-interpol/packages.lisp,v 1.11 2008/07/23 15:35:07 edi Exp $ 3 | 4 | ;;; Copyright (c) 2003-2008, Dr. Edmund Weitz. All rights reserved. 5 | 6 | ;;; Redistribution and use in source and binary forms, with or without 7 | ;;; modification, are permitted provided that the following conditions 8 | ;;; are met: 9 | 10 | ;;; * Redistributions of source code must retain the above copyright 11 | ;;; notice, this list of conditions and the following disclaimer. 12 | 13 | ;;; * Redistributions in binary form must reproduce the above 14 | ;;; copyright notice, this list of conditions and the following 15 | ;;; disclaimer in the documentation and/or other materials 16 | ;;; provided with the distribution. 17 | 18 | ;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED 19 | ;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 20 | ;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 21 | ;;; ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY 22 | ;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 23 | ;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE 24 | ;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 25 | ;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, 26 | ;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 27 | ;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 28 | ;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 29 | 30 | (in-package :cl-user) 31 | 32 | (defpackage :cl-interpol 33 | (:nicknames :interpol) 34 | (:use :cl :cl-unicode :cl-ppcre) 35 | (:import-from :named-readtables 36 | :defreadtable) 37 | (:export :enable-interpol-syntax 38 | :disable-interpol-syntax 39 | :*list-delimiter* 40 | :*outer-delimiters* 41 | :*inner-delimiters* 42 | :*optional-delimiters-p* 43 | :*interpolate-format-directives* 44 | :interpol-reader)) 45 | -------------------------------------------------------------------------------- /read.lisp: -------------------------------------------------------------------------------- 1 | ;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-INTERPOL; Base: 10 -*- 2 | ;;; $Header: /usr/local/cvsrep/cl-interpol/read.lisp,v 1.31 2008/07/23 15:13:08 edi Exp $ 3 | 4 | ;;; Copyright (c) 2003-2008, Dr. Edmund Weitz. All rights reserved. 5 | 6 | ;;; Redistribution and use in source and binary forms, with or without 7 | ;;; modification, are permitted provided that the following conditions 8 | ;;; are met: 9 | 10 | ;;; * Redistributions of source code must retain the above copyright 11 | ;;; notice, this list of conditions and the following disclaimer. 12 | 13 | ;;; * Redistributions in binary form must reproduce the above 14 | ;;; copyright notice, this list of conditions and the following 15 | ;;; disclaimer in the documentation and/or other materials 16 | ;;; provided with the distribution. 17 | 18 | ;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED 19 | ;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 20 | ;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 21 | ;;; ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY 22 | ;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 23 | ;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE 24 | ;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 25 | ;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, 26 | ;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 27 | ;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 28 | ;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 29 | 30 | (in-package :cl-interpol) 31 | 32 | (defun read-while (predicate &key max) 33 | "Reads characters from *STREAM* while PREDICATE returns a true value 34 | for each character. Returns at most MAX characters if MAX is true." 35 | (when (eql max 0) 36 | (return-from read-while "")) 37 | (let ((collector (make-collector))) 38 | (loop for count of-type fixnum from 1 39 | for c = (peek-char*) 40 | while (and (or (not max) 41 | (<= count max)) 42 | c 43 | (funcall predicate c)) 44 | do (vector-push-extend (read-char*) collector) 45 | finally (return collector)))) 46 | 47 | (declaim (inline get-number)) 48 | (defun get-number (&key (radix 10) max) 49 | "Reads and consumes the number *STREAM* is currently looking at and 50 | returns it. Returns NIL if no number could be identified. RADIX is 51 | used as in PARSE-INTEGER. If MAX is not NIL we'll read at most the 52 | next MAX characters." 53 | (parse-integer (read-while (lambda (c) 54 | (digit-char-p c radix)) 55 | :max max) 56 | :radix radix 57 | :junk-allowed t)) 58 | 59 | (defun resolve-unicode-name (name) 60 | "Tries to return a character which was encoded as \\N." 61 | (or (character-named name) 62 | (gethash (canonicalize-name name) *unicode-aliases*))) 63 | 64 | (defun get-char-from-unicode-name () 65 | "Parses and returns a named character after \"\\N\" has already been 66 | read. This function reads from *STREAM*." 67 | (let ((next-char (read-char*))) 68 | (unless (char= next-char #\{) 69 | (signal-reader-error "Expected { after \\N")) 70 | (let ((name (read-while (lambda (c) 71 | (and (char/= c #\}) 72 | (char/= c *term-char*)))))) 73 | (let ((next-char (read-char*))) 74 | (unless (char= next-char #\}) 75 | (signal-reader-error "Expected } after Unicode character name"))) 76 | (or (resolve-unicode-name name) 77 | (signal-reader-error "Could not find character with name '~A'" 78 | name))))) 79 | 80 | (defun unescape-char (regex-mode) 81 | "Convert the characters(s) on *STREAM* following a backslash into a 82 | character which is returned. This function is to be called when the 83 | backslash has already been consumed." 84 | (let ((chr (read-char*))) 85 | ;; certain escape sequences are left as is when in regex mode 86 | (when (or (and (eq regex-mode :in-char-class) 87 | (find chr "pPwWsSdD" :test #'char=)) 88 | (and (eq regex-mode t) 89 | (find chr "kpPwWsSdDbBAZz" :test #'char=))) 90 | (return-from unescape-char 91 | (concatenate 'string "\\" (string chr)))) 92 | (let ((result 93 | (case chr 94 | ((#\N) 95 | ;; named Unicode chars 96 | (get-char-from-unicode-name)) 97 | ((#\c) 98 | ;; \cx means control-x 99 | (when (char= (peek-char*) *term-char*) 100 | (signal-reader-error "String ended after \\c")) 101 | (code-char (logxor #x40 102 | (char-code (char-upcase (read-char*)))))) 103 | ((#\x) 104 | (cond ((char= (peek-char*) #\{) 105 | ;; "wide" hex char, i.e. hexadecimal number is 106 | ;; enclosed in curly brackets 107 | (read-char*) 108 | (prog1 109 | (let ((code (or (get-number :radix 16) 110 | ;; allow for empty string 111 | 0))) 112 | (or (and (< code char-code-limit) 113 | (code-char code)) 114 | (signal-reader-error 115 | "No character for char-code #x~X" code))) 116 | (unless (char= (peek-char*) #\}) 117 | (signal-reader-error "Expected } after hex code")) 118 | (read-char*))) 119 | (t 120 | ;; \x should be followed by a hexadecimal char 121 | ;; code, two digits or less; note that it is 122 | ;; OK if \x is followed by zero digits 123 | (make-char-from-code (get-number :radix 16 :max 2))))) 124 | ((#\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9) 125 | (cond ((and (eq regex-mode t) 126 | (char/= chr #\0)) 127 | ;; leave as is if we're in regex mode (and not 128 | ;; within in a character class) 129 | (concatenate 'string "\\" (string chr))) 130 | ((or (char= chr #\8) 131 | (char= chr #\9)) 132 | ;; outside of regex mode "\8" is "8" (in regex 133 | ;; mode it is read like "\08"...) 134 | chr) 135 | (t 136 | (unread-char chr *stream*) 137 | ;; now \x should be followed by an octal char 138 | ;; code, three digits or less 139 | (make-char-from-code (get-number :radix 8 :max 3))))) 140 | ((#\Newline) 141 | (read-while 142 | (lambda (c) 143 | (or (char= c #\Space) 144 | (not (or (graphic-char-p c) (char= c #\Newline)))))) 145 | "") 146 | ;; the following five character names are 147 | ;; 'semi-standard' according to the CLHS but I'm not 148 | ;; aware of any implementation that doesn't implement 149 | ;; them 150 | ((#\t) 151 | #\Tab) 152 | ((#\n) 153 | #\Newline) 154 | ((#\r) 155 | #\Return) 156 | ((#\f) 157 | #\Page) 158 | ((#\b) 159 | #\Backspace) 160 | ((#\a) 161 | (code-char 7)) ; ASCII bell 162 | ((#\e) 163 | (code-char 27)) ; ASCII escape 164 | (otherwise 165 | ;; all other characters aren't affected by a backslash 166 | chr)))) 167 | (cond ((and (characterp result) 168 | ;; some characters must be 'protected' from CL-PPCRE 169 | (or (and (eq regex-mode :in-char-class) 170 | (find result "\\^[]-" :test #'char=)) 171 | (and (eq regex-mode t) 172 | (find result "\\^[]-.$|()*+?" :test #'char=)))) 173 | (concatenate 'string "\\" (string result))) 174 | (t result))))) 175 | 176 | (declaim (inline normal-name-char-p) 177 | (inline never-name-char-p)) 178 | 179 | (defun normal-name-char-p (c) 180 | (and c (or (alphanumericp c) 181 | (member c '(#\_ #\- #\+ #\*))))) 182 | 183 | (defun never-name-char-p (c) 184 | (or (not c) 185 | (get-macro-character c) 186 | (member c '(#\$ #\@)))) 187 | 188 | (defvar quell-warnings-form 189 | #+sbcl '(declare (optimize (sb-ext:inhibit-warnings 3))) 190 | #-sbcl nil 191 | "A declaration form to quiet warnings about unbound variables 192 | within a lexical environment.") 193 | 194 | (defun read-longest-name () 195 | (coerce 196 | (loop until (never-name-char-p (peek-char nil *stream* nil nil t)) 197 | collect (read-char*)) 198 | 'string)) 199 | 200 | (defun read-optional-delimited () 201 | "Read the stuff following an optional delimiter, returning a form 202 | that tries to deal correctly with lexical variables." 203 | (flet ((try-pos (name i form) 204 | (let ((ostr (gensym))) 205 | `(handler-case 206 | (with-output-to-string (,ostr) 207 | (princ ,(read-from-string (subseq name 0 i)) ,ostr) 208 | (princ ,(subseq name i) ,ostr) 209 | ,ostr) 210 | (unbound-variable () ,form))))) 211 | 212 | (loop 213 | with name = (read-longest-name) 214 | with form = `(error ,(format nil "Interpolation error in ~s~%" name)) 215 | with ostr = (gensym) 216 | for i = (position-if-not #'normal-name-char-p name) 217 | then (position-if-not #'normal-name-char-p name :start (1+ i)) 218 | 219 | unless i 220 | return `(let () ,quell-warnings-form 221 | (handler-case 222 | (with-output-to-string (,ostr) 223 | (princ ,(read-from-string name) ,ostr) 224 | ,ostr) 225 | (unbound-variable () ,form))) 226 | 227 | if (> i 0) 228 | do (setq form (try-pos name i form)) 229 | 230 | if (< i (length name)) 231 | do (setq form (try-pos name (1+ i) form))))) 232 | 233 | (declaim (inline read-form)) 234 | (defun read-form (&key (recursive-p t)) 235 | "Reads and returns one or more Lisp forms from *STREAM* if the 236 | character we're looking at is a valid inner delimiter. Otherwise 237 | returns NIL." 238 | (let* ((start-delimiter (peek-char*)) 239 | (end-delimiter (get-end-delimiter start-delimiter *inner-delimiters*))) 240 | (cond ((null end-delimiter) 241 | (if *optional-delimiters-p* 242 | (read-optional-delimited) 243 | nil)) 244 | (t 245 | `(progn 246 | ,@(progn 247 | (read-char*) 248 | (let ((*readtable* (copy-readtable*))) 249 | ;; temporarily change the readtable 250 | (set-syntax-from-char end-delimiter #\)) 251 | (read-delimited-list end-delimiter *stream* recursive-p)))))))) 252 | 253 | (defun read-format-directive () 254 | "Reads and returns a format directive (as a string) along with one 255 | or more lisp forms (as per read-form)." 256 | (let ((format-directive (make-collector))) 257 | (labels ((read-quoted-char () 258 | (if (char= #\' (peek-char*)) 259 | (progn 260 | (vector-push-extend (read-char*) format-directive) 261 | (vector-push-extend (read-char*) format-directive) 262 | t) 263 | nil)) 264 | (read-integer () 265 | (if (member (peek-char*) '(#\- #\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9)) 266 | (progn 267 | (vector-push-extend (read-char*) format-directive) 268 | (loop while (member (peek-char*) '(#\- #\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9)) 269 | do (vector-push-extend (read-char*) format-directive)) 270 | t) 271 | nil)) 272 | (read-modifier () 273 | (loop repeat 2 274 | with found = nil 275 | when (member (peek-char*) '(#\@ #\:)) 276 | do (vector-push-extend (read-char*) format-directive) 277 | and do (setf found t) 278 | finally (return found))) 279 | (read-comma () 280 | (if (char= #\, (peek-char*)) 281 | (progn 282 | (vector-push-extend (read-char*) format-directive) 283 | t) 284 | nil)) 285 | (read-v () 286 | (if (char-equal #\v (peek-char*)) 287 | (progn 288 | (vector-push-extend (read-char*) format-directive) 289 | t) 290 | nil))) 291 | (loop 292 | while (or (read-quoted-char) 293 | (read-integer) 294 | (read-v) 295 | (read-comma)) 296 | finally (read-modifier) 297 | finally (vector-push-extend (read-char*) format-directive)) 298 | format-directive))) 299 | 300 | (defun interpol-reader (*stream* char arg &key (recursive-p t)) 301 | "The actual reader function for the 'sub-character' #\?. 302 | 303 | This function can be used directly outside of a read table by passing `recursive-p` as NIL. 304 | 305 | " 306 | (declare (ignore arg char)) 307 | (let ((*start-char* (read-char*)) 308 | ;; REGEX-MODE is true if we're in regular expression mode; it 309 | ;; can have one of the values :START-OF-CHAR-CLASS, 310 | ;; :START-OF-NEGATED-CHAR-CLASS, or :IN-CHAR-CLASS if we're 311 | ;; inside of a character class or just about to start one - 312 | ;; otherwise the value is T 313 | regex-mode 314 | ;; EXTENDED-MODE is true if we're in extended regular 315 | ;; expression mode 316 | extended-mode) 317 | (when (char-equal *start-char* #\r) 318 | (setq regex-mode t 319 | *start-char* (read-char*))) 320 | (when (char-equal *start-char* #\x) 321 | (setq extended-mode t 322 | *start-char* (read-char*))) 323 | (when (and (not regex-mode) 324 | (find *start-char* *regex-delimiters* :test #'char=)) 325 | (setq regex-mode t)) 326 | (unless regex-mode 327 | (setq extended-mode nil)) 328 | (let ((*term-char* (get-end-delimiter *start-char* 329 | *outer-delimiters* 330 | :errorp t)) 331 | (*pair-level* 0) 332 | (*inner-delimiters* (if regex-mode 333 | (intersection *inner-delimiters* 334 | '((#\{ . #\})) 335 | :test #'equal) 336 | *inner-delimiters*)) 337 | *saw-backslash* 338 | *readtable-copy*) 339 | (prog1 340 | (inner-reader regex-mode extended-mode nil nil :recursive-p recursive-p) 341 | ;; consume the closing outer delimiter 342 | (read-char*))))) 343 | 344 | (defun inner-reader (regex-mode extended-mode quote-mode case-mode &key (recursive-p t)) 345 | "Helper function for INTERPOL-READER which does all the work. May 346 | call itself recursively." 347 | ;; REGEX-MODE and EXTENDED-MODE as described above; QUOTE-MODE is 348 | ;; true if we're inside a \Q scope; CASE-MODE is true if we're 349 | ;; inside a \L or \U scope 350 | (let* ((string-stream (gensym)) ;; the string stream 351 | ;; we use for WITH-OUTPUT-TO-STRING 352 | ;; if this is not a constant string 353 | (collector (make-collector)) ;; we collect 354 | ;; characters into this 355 | ;; extentable string 356 | result ;; a list of all characters, strings, and forms 357 | ;; so far (in reverse order while withing the loop) 358 | handle-next-char) 359 | (block main-loop ;; we need this name so we can leave the LOOP below 360 | (flet ((compute-result () 361 | ;; local function used to leave the loop and compute 362 | ;; the final RESULT 363 | (setq result 364 | (nreverse 365 | (if (plusp (length collector)) 366 | ;; add COLLECTOR if it's not empty 367 | (cons collector result) 368 | result))) 369 | (return-from main-loop)) 370 | (parse-with-case-mode (action-name) 371 | ;; local function used to read while in a \U or \L scope 372 | (let ((string-to-modify 373 | ;; read until \E, \L, \U, or end of string 374 | (inner-reader regex-mode extended-mode regex-mode t))) 375 | (if (stringp string-to-modify) 376 | ;; modify directly if constant string 377 | (funcall action-name string-to-modify) 378 | ;; otherwise create a form to do that at run time 379 | `(write-string 380 | (,action-name ,string-to-modify) 381 | ,string-stream))))) 382 | (loop 383 | (let ((next-char (read-char*))) 384 | (when regex-mode 385 | ;; when in regex mode make sure where we are with 386 | ;; respect to character classes 387 | (setq regex-mode 388 | (case next-char 389 | ((#\[) 390 | (ecase regex-mode 391 | ((:start-of-char-class 392 | :start-of-negated-char-class 393 | :in-char-class) :in-char-class) 394 | ((t) :start-of-char-class))) 395 | ((#\^) 396 | (ecase regex-mode 397 | ((:start-of-char-class) :start-of-negated-char-class) 398 | ((:start-of-negated-char-class 399 | :in-char-class) :in-char-class) 400 | ((t) t))) 401 | ((#\]) 402 | (ecase regex-mode 403 | ((:start-of-char-class 404 | :start-of-negated-char-class) :in-char-class) 405 | ((:in-char-class t) t))) 406 | (otherwise 407 | (ecase regex-mode 408 | ((:start-of-char-class 409 | :start-of-negated-char-class 410 | :in-char-class) :in-char-class) 411 | ((t) t)))))) 412 | (when (and (char= next-char *start-char*) 413 | (char/= *start-char* *term-char*)) 414 | ;; if we see, say, #\( and our closing delimiter is #\) 415 | ;; we increment *PAIR-LEVEL* so the parentheses can next 416 | ;; without ending the string 417 | (incf *pair-level*)) 418 | (let ((interpolation 419 | (cond ((and (char= next-char *term-char*) 420 | (plusp *pair-level*)) 421 | ;; although this is the outer closing 422 | ;; delimiter we don't stop parsing because 423 | ;; we're insided a nested pair of 424 | ;; bracketing characters 425 | (decf *pair-level*) 426 | *term-char*) 427 | ((char= next-char *term-char*) 428 | ;; now we really stop - but we don't 429 | ;; consume the closing delimiter because 430 | ;; we may need it again to end another 431 | ;; scope 432 | (unread-char next-char *stream*) 433 | (compute-result)) 434 | (t 435 | (case next-char 436 | ((#\L) 437 | (cond ((not *saw-backslash*) 438 | ;; a normal #\L, no 'pending' 439 | ;; backslash 440 | #\L) 441 | (case-mode 442 | ;; a backslashed #\L which 443 | ;; we've seen before but we 444 | ;; still have to close at 445 | ;; least one \Q/\L/\E scope 446 | (unread-char #\L *stream*) 447 | (compute-result)) 448 | (t 449 | ;; all scopes are closed, now 450 | ;; read and downcase 'till \E 451 | ;; or somesuch 452 | (setq *saw-backslash* nil) 453 | (parse-with-case-mode 'string-downcase)))) 454 | ((#\U) 455 | ;; see comments for #\L above 456 | (cond ((not *saw-backslash*) 457 | #\U) 458 | (case-mode 459 | (unread-char #\U *stream*) 460 | (compute-result)) 461 | (t 462 | (setq *saw-backslash* nil) 463 | (parse-with-case-mode 'string-upcase)))) 464 | ((#\Space #\Tab #\Linefeed #\Return #\Page) 465 | (cond ((and extended-mode 466 | (not (eq regex-mode :in-char-class))) 467 | ;; in extended mode (if not in 468 | ;; a character class) 469 | ;; whitespace is removed 470 | "") 471 | (t next-char))) 472 | ((#\() 473 | (cond ((and (eq regex-mode t) 474 | (null quote-mode) 475 | (char/= *term-char* #\?) 476 | (eql (peek-char*) #\?)) 477 | ;; this could start an 478 | ;; embedded comment in regex 479 | ;; mode (and we're /not/ 480 | ;; inside of a \Q scope or a 481 | ;; character class) 482 | (read-char*) 483 | (cond ((and (char/= *term-char* #\#) 484 | (eql (peek-char*) #\#)) 485 | ;; yes, it's a 486 | ;; comment, so consume 487 | ;; characters 'till #\) 488 | (read-while 489 | (lambda (char) 490 | (and (char/= char #\)) 491 | (char/= char *term-char*)))) 492 | (cond ((char= (read-char*) *term-char*) 493 | (signal-reader-error 494 | "Incomplete regex comment starting with '(#'")) 495 | ((not (digit-char-p (peek-char*) 16)) 496 | "") 497 | ;; special case 498 | ;; if next 499 | ;; character 500 | ;; could 501 | ;; potentially 502 | ;; continue an 503 | ;; octal or 504 | ;; hexadecimal 505 | ;; representation 506 | (t "(?:)"))) 507 | ;; no, wasn't a comment 508 | (t "(?"))) 509 | (t #\())) 510 | ((#\#) 511 | (cond ((and (eq regex-mode t) 512 | extended-mode 513 | (null quote-mode)) 514 | ;; we're in extended regex 515 | ;; mode and not inside of a \Q 516 | ;; scope or a character class, 517 | ;; so this is a comment and we 518 | ;; consume it 'till #\Newline 519 | ;; or *TERM-CHAR* 520 | (read-while 521 | (lambda (char) 522 | (and (char/= char #\Newline) 523 | (char/= char *term-char*)))) 524 | (when (char= (peek-char*) #\Newline) 525 | (read-char*)) 526 | (cond ((not (digit-char-p (peek-char*) 527 | 16)) 528 | "") 529 | ;; special case, see above 530 | (t "(?:)"))) 531 | (t #\#))) 532 | ((#\\) 533 | (case (peek-char*) 534 | ((#\Q) 535 | ;; \Q - start a new quote scope 536 | (read-char*) 537 | (let ((string-to-quote 538 | (inner-reader regex-mode 539 | extended-mode 540 | t case-mode))) 541 | (if (stringp string-to-quote) 542 | ;; if we got a constant string 543 | ;; we modify it directly 544 | (quote-meta-chars string-to-quote) 545 | ;; otherwise we expand into code 546 | `(write-string 547 | (quote-meta-chars ,string-to-quote) 548 | ,string-stream)))) 549 | ((#\L) 550 | ;; \L - start a new case-modifying 551 | ;; scope 552 | (cond (case-mode 553 | ;; if we're already in 554 | ;; this mode we have to 555 | ;; end all previous scopes 556 | ;; first - we set 557 | ;; *SAW-BACKSLASH* to T so 558 | ;; the #\L is read until 559 | ;; all scopes are finished 560 | (setq *saw-backslash* t) 561 | (compute-result)) 562 | (t 563 | ;; all scopes are closed, now 564 | ;; read and downcase 'till \E 565 | ;; or somesuch 566 | (setq *saw-backslash* nil) 567 | (read-char*) 568 | (parse-with-case-mode 'string-downcase)))) 569 | ((#\U) 570 | ;; see comments for #\L above 571 | (cond (case-mode 572 | (setq *saw-backslash* t) 573 | (compute-result)) 574 | (t 575 | (setq *saw-backslash* nil) 576 | (read-char*) 577 | (parse-with-case-mode 'string-upcase)))) 578 | ((#\E) 579 | ;; \E - ends exactly one scope 580 | (read-char*) 581 | (if (or quote-mode case-mode) 582 | (compute-result) 583 | "")) 584 | ((#\l) 585 | ;; \l - downcase next character 586 | (read-char*) 587 | ;; remember that we have to do this 588 | (setq handle-next-char :downcase) 589 | nil) 590 | ((#\u) 591 | ;; \u - upcase next character 592 | (read-char*) 593 | ;; remember that we have to do this 594 | (setq handle-next-char :upcase) 595 | nil) 596 | (otherwise 597 | ;; otherwise this is a 598 | ;; backslash-escaped character 599 | (unescape-char regex-mode)))) 600 | ((#\~) 601 | ;; #\~ - might be an inline format directive 602 | (if *interpolate-format-directives* 603 | `(format ,string-stream 604 | ,(concatenate 'string "~" (read-format-directive)) 605 | ,@(let ((form (read-form :recursive-p recursive-p))) 606 | (if form 607 | (list form) 608 | '()))) 609 | #\~)) 610 | ((#\$) 611 | ;; #\$ - might be an interpolation 612 | (let ((form (read-form :recursive-p recursive-p))) 613 | (cond ((null form) 614 | ;; no, just dollar sign 615 | #\$) 616 | (handle-next-char 617 | ;; yes, and we have to 618 | ;; modify the first 619 | ;; character 620 | (prog1 621 | (let ((string (gensym))) 622 | `(let ((,string (format nil "~A" 623 | ,form))) 624 | (when (plusp (length ,string)) 625 | (setf (char ,string 0) 626 | (,(if (eq handle-next-char 627 | :downcase) 628 | 'char-downcase 629 | 'char-upcase) 630 | (char ,string 0)))) 631 | (write-string ,string ,string-stream))) 632 | (setq handle-next-char nil))) 633 | (t 634 | ;; no modification, just 635 | ;; insert a form to PRINC 636 | ;; this interpolation 637 | `(princ ,form ,string-stream))))) 638 | ((#\@) 639 | ;; #\Q - might be an interpolation 640 | (let ((form (read-form :recursive-p recursive-p)) 641 | (element (gensym)) 642 | (first (gensym))) 643 | (cond ((null form) 644 | ;; no, just at-sign 645 | #\@) 646 | (handle-next-char 647 | ;; yes, and we have to 648 | ;; modify the first 649 | ;; character 650 | (prog1 651 | (let ((string (gensym))) 652 | `(loop for ,first = t then nil 653 | for ,element in ,form 654 | unless ,first do 655 | (princ *list-delimiter* 656 | ,string-stream) 657 | if ,first do 658 | (let ((,string 659 | (format nil "~A" 660 | ,element))) 661 | (when (plusp (length ,string)) 662 | (setf (char ,string 0) 663 | (,(if (eq handle-next-char 664 | :downcase) 665 | 'char-downcase 666 | 'char-upcase) 667 | (char ,string 0)))) 668 | (write-string ,string ,string-stream)) 669 | else do 670 | (princ ,element ,string-stream))) 671 | (setq handle-next-char nil))) 672 | (t 673 | ;; no modification, just 674 | ;; insert a form to PRINC 675 | ;; this interpolated list 676 | ;; (including the list 677 | ;; delimiters inbetween) 678 | `(loop for ,first = t then nil 679 | for ,element in ,form 680 | unless ,first do (princ *list-delimiter* 681 | ,string-stream) 682 | do (princ ,element ,string-stream)))))) 683 | ;; just a 'normal' character 684 | (otherwise next-char)))))) 685 | (when interpolation 686 | ;; INTERPOLATION is NIL if we just saw #\l or #\u 687 | (when (and handle-next-char 688 | (consp interpolation) 689 | (eq (first interpolation) 690 | 'write-string)) 691 | ;; if we have to upcase or downcase the following 692 | ;; character and we just collected a form (from a 693 | ;; \Q/\L/\U scope) we have to insert code for the 694 | ;; modification 695 | (setf (second interpolation) 696 | (let ((string (gensym))) 697 | `(let ((,string ,(second interpolation))) 698 | (when (plusp (length ,string)) 699 | (setf (char ,string 0) 700 | (,(if (eq handle-next-char :downcase) 701 | 'char-downcase 702 | 'char-upcase) 703 | (char ,string 0)))) 704 | ,string))) 705 | (setq handle-next-char nil)) 706 | (cond ((characterp interpolation) 707 | ;; add one character to COLLECTOR and handle 708 | ;; it according to HANDLE-NEXT-CHAR 709 | (vector-push-extend (case handle-next-char 710 | ((:downcase) 711 | (setq handle-next-char nil) 712 | (char-downcase interpolation)) 713 | ((:upcase) 714 | (setq handle-next-char nil) 715 | (char-upcase interpolation)) 716 | (otherwise 717 | interpolation)) 718 | collector)) 719 | ((stringp interpolation) 720 | ;; add a string to COLLECTOR and handle its 721 | ;; first character according to 722 | ;; HANDLE-NEXT-CHAR 723 | (loop for char across interpolation 724 | do (vector-push-extend (case handle-next-char 725 | ((:downcase) 726 | (setq handle-next-char nil) 727 | (char-downcase char)) 728 | ((:upcase) 729 | (setq handle-next-char nil) 730 | (char-upcase char)) 731 | (otherwise 732 | char)) 733 | collector))) 734 | ((plusp (length collector)) 735 | ;; add code (to be executed at runtime) but 736 | ;; make sure to empty COLLECTOR first 737 | (push collector result) 738 | (push interpolation result) 739 | ;; reset collector 740 | (setf collector (make-collector))) 741 | (t 742 | ;; same but COLLECTOR is empty 743 | (push interpolation result))))))))) 744 | (if (every #'stringp result) 745 | ;; if all elements of RESULT are strings we can return a 746 | ;; constant string 747 | (string-list-to-string result) 748 | ;; otherwise we have to wrap the PRINCs emitted above into a 749 | ;; WITH-OUTPUT-TO-STRING form 750 | `(with-output-to-string (,string-stream) 751 | ,@(loop for interpolation in result 752 | if (stringp interpolation) 753 | collect `(write-string ,interpolation ,string-stream) 754 | else 755 | collect interpolation))))) 756 | 757 | (defun %enable-interpol-syntax (&key (modify-*readtable* nil)) 758 | "Internal function used to enable reader syntax and store current 759 | readtable on stack." 760 | (unless modify-*readtable* 761 | (push *readtable* 762 | *previous-readtables*) 763 | (setq *readtable* (copy-readtable))) 764 | (set-dispatch-macro-character #\# #\? #'interpol-reader) 765 | (values)) 766 | 767 | (defun %disable-interpol-syntax () 768 | "Internal function used to restore previous readtable." 769 | (if *previous-readtables* 770 | (setq *readtable* (pop *previous-readtables*)) 771 | (setq *readtable* (copy-readtable nil))) 772 | (values)) 773 | 774 | (defmacro enable-interpol-syntax (&rest %enable-interpol-syntax-args) 775 | "Enable CL-INTERPOL reader syntax." 776 | `(eval-when (:compile-toplevel :load-toplevel :execute) 777 | (%enable-interpol-syntax ,@%enable-interpol-syntax-args))) 778 | 779 | (defmacro disable-interpol-syntax () 780 | "Restore readtable which was active before last call to 781 | ENABLE-INTERPOL-SYNTAX. If there was no such call, the standard 782 | readtable is used." 783 | `(eval-when (:compile-toplevel :load-toplevel :execute) 784 | (%disable-interpol-syntax))) 785 | 786 | (defreadtable :interpol-syntax 787 | (:merge :standard) 788 | (:dispatch-macro-char #\# #\? #'interpol-reader)) 789 | -------------------------------------------------------------------------------- /specials.lisp: -------------------------------------------------------------------------------- 1 | ;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-INTERPOL; Base: 10 -*- 2 | ;;; $Header: /usr/local/cvsrep/cl-interpol/specials.lisp,v 1.12 2008/07/23 13:58:40 edi Exp $ 3 | 4 | ;;; Copyright (c) 2003-2008, Dr. Edmund Weitz. All rights reserved. 5 | 6 | ;;; Redistribution and use in source and binary forms, with or without 7 | ;;; modification, are permitted provided that the following conditions 8 | ;;; are met: 9 | 10 | ;;; * Redistributions of source code must retain the above copyright 11 | ;;; notice, this list of conditions and the following disclaimer. 12 | 13 | ;;; * Redistributions in binary form must reproduce the above 14 | ;;; copyright notice, this list of conditions and the following 15 | ;;; disclaimer in the documentation and/or other materials 16 | ;;; provided with the distribution. 17 | 18 | ;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED 19 | ;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 20 | ;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 21 | ;;; ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY 22 | ;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 23 | ;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE 24 | ;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 25 | ;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, 26 | ;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 27 | ;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 28 | ;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 29 | 30 | (in-package :cl-interpol) 31 | 32 | (defvar *list-delimiter* #\Space 33 | "What is inserted between the elements of a list which is 34 | interpolated by #\@.") 35 | 36 | (defvar *inner-delimiters* '((#\( . #\)) 37 | (#\{ . #\}) 38 | (#\< . #\>) 39 | (#\[ . #\])) 40 | "Legal delimiters for interpolation with #\$ and #\@.") 41 | 42 | (defvar *outer-delimiters* '((#\( . #\)) 43 | (#\{ . #\}) 44 | (#\< . #\>) 45 | (#\[ . #\]) 46 | #\/ #\| #\" #\' #\#) 47 | "Legal outer delimiters for CL-INTERPOL strings.") 48 | 49 | (defvar *regex-delimiters* '(#\/) 50 | "Outer delimiters which automatically enable regex mode.") 51 | 52 | (defvar *unicode-aliases* 53 | (make-hash-table :test #'equalp) 54 | "A hash table which maps Unicode aliases to their real names.") 55 | 56 | (defvar *optional-delimiters-p* nil 57 | "Whether text following $ or @ should interpolate even without a 58 | following delimiter. Lexical variables are handled correctly, 59 | but the rules are somewhat complex -- see the docs for details.") 60 | 61 | (defvar *interpolate-format-directives* nil 62 | "Whether to allow ~X(...) as format control directives in interpolated strings.") 63 | 64 | (defmacro defvar-unbound (variable-name documentation) 65 | "Like DEFVAR, but the variable will be unbound rather than getting 66 | an initial value. This is useful for variables which should have no 67 | global value but might have a dynamically bound value." 68 | ;; stolen from comp.lang.lisp article by 69 | ;; "prunesquallor@comcast.net" 70 | `(eval-when (:load-toplevel :compile-toplevel :execute) 71 | (defvar ,variable-name) 72 | (setf (documentation ',variable-name 'variable) 73 | ,documentation))) 74 | 75 | (defvar-unbound *saw-backslash* 76 | "Whether we have to re-process an \L or \U because it closes several 77 | scopes.") 78 | 79 | (defvar-unbound *pair-level* 80 | "") 81 | 82 | (defvar-unbound *stream* 83 | "Bound to the stream which is read from while parsing a string.") 84 | 85 | (defvar-unbound *start-char* 86 | "Bound to the opening outer delimiter while parsing a string.") 87 | 88 | (defvar-unbound *term-char* 89 | "Bound to the closing outer delimiter while parsing a string.") 90 | 91 | (defvar *previous-readtables* nil 92 | "A stack which holds the previous readtables that have been pushed 93 | here by ENABLE-INTERPOL-SYNTAX.") 94 | 95 | (defvar-unbound *readtable-copy* 96 | "Bound to the current readtable if it has to be temporarily 97 | modified.") 98 | 99 | ;; stuff for Nikodemus Siivola's HYPERDOC 100 | ;; see 101 | ;; and 102 | 103 | (defvar *hyperdoc-base-uri* "http://weitz.de/cl-interpol/") 104 | 105 | (let ((exported-symbols-alist 106 | (loop for symbol being the external-symbols of :cl-interpol 107 | collect (cons symbol 108 | (concatenate 'string 109 | "#" 110 | (string-downcase symbol)))))) 111 | (defun hyperdoc-lookup (symbol type) 112 | (declare (ignore type)) 113 | (cdr (assoc symbol 114 | exported-symbols-alist 115 | :test #'eq)))) 116 | -------------------------------------------------------------------------------- /test/create_perl_tests.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | ### $Header: /usr/local/cvsrep/cl-interpol/test/create_perl_tests.pl,v 1.2 2008/07/23 14:30:45 edi Exp $ 4 | 5 | ### Copyright (c) 2003-2008, Dr. Edmund Weitz. All rights reserved. 6 | 7 | ### Redistribution and use in source and binary forms, with or without 8 | ### modification, are permitted provided that the following conditions 9 | ### are met: 10 | 11 | ### * Redistributions of source code must retain the above copyright 12 | ### notice, this list of conditions and the following disclaimer. 13 | 14 | ### * Redistributions in binary form must reproduce the above 15 | ### copyright notice, this list of conditions and the following 16 | ### disclaimer in the documentation and/or other materials 17 | ### provided with the distribution. 18 | 19 | ### THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED 20 | ### OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 21 | ### WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 22 | ### ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY 23 | ### DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 24 | ### DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE 25 | ### GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 26 | ### INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, 27 | ### WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 28 | ### NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 29 | ### SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 30 | 31 | my @chars = qw(\Q \L \U \E \l \u); 32 | 33 | sub combine { 34 | my $delim = shift; 35 | my @result = (); 36 | foreach my $char (@chars) { 37 | foreach my $string (@_) { 38 | push @result, "$char$delim$string"; 39 | } 40 | } 41 | @result; 42 | } 43 | 44 | sub quote { 45 | local $_ = shift; 46 | s/\\/\\\\/g; 47 | $_; 48 | } 49 | 50 | print <<'HEAD'; 51 | ;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-UNICODE-TEST; Base: 10 -*- 52 | ;;; $Header: /usr/local/cvsrep/cl-interpol/test/create_perl_tests.pl,v 1.2 2008/07/23 14:30:45 edi Exp $ 53 | 54 | ;;; some simple tests for CL-INTERPOL which were generated by a Perl script 55 | 56 | HEAD 57 | 58 | foreach my $a (('Aa-', 'aA-')) { 59 | my $counter = 0; 60 | my @arr = @chars; 61 | while ($counter++ < 4) { 62 | foreach my $str (@arr) { 63 | print "(let ((a \"$a\"))\n"; 64 | my $test = "\${a}$str\${a}"; 65 | print " (string= #?\"$test\" \"" . (quote eval "\"$test\"") . "\"))\n"; 66 | } 67 | @arr = combine '${a}', @arr; 68 | } 69 | } 70 | 71 | print "\n"; 72 | -------------------------------------------------------------------------------- /test/packages.lisp: -------------------------------------------------------------------------------- 1 | ;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-USER; Base: 10 -*- 2 | ;;; $Header: /usr/local/cvsrep/cl-interpol/test/packages.lisp,v 1.2 2008/07/23 13:58:44 edi Exp $ 3 | 4 | ;;; Copyright (c) 2002-2008, Dr. Edmund Weitz. All rights reserved. 5 | 6 | ;;; Redistribution and use in source and binary forms, with or without 7 | ;;; modification, are permitted provided that the following conditions 8 | ;;; are met: 9 | 10 | ;;; * Redistributions of source code must retain the above copyright 11 | ;;; notice, this list of conditions and the following disclaimer. 12 | 13 | ;;; * Redistributions in binary form must reproduce the above 14 | ;;; copyright notice, this list of conditions and the following 15 | ;;; disclaimer in the documentation and/or other materials 16 | ;;; provided with the distribution. 17 | 18 | ;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED 19 | ;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 20 | ;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 21 | ;;; ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY 22 | ;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 23 | ;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE 24 | ;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 25 | ;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, 26 | ;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 27 | ;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 28 | ;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 29 | 30 | (in-package :cl-user) 31 | 32 | (defpackage :cl-interpol-test 33 | (:use :cl :cl-interpol :cl-unicode) 34 | (:export :run-all-tests)) 35 | -------------------------------------------------------------------------------- /test/simple: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/edicl/cl-interpol/d4f49d45257be1512db238a5c740e69a5964e93b/test/simple -------------------------------------------------------------------------------- /test/tests.lisp: -------------------------------------------------------------------------------- 1 | ;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-INTERPOL-TEST; Base: 10 -*- 2 | ;;; $Header: /usr/local/cvsrep/cl-interpol/test/tests.lisp,v 1.4 2008/07/23 16:10:13 edi Exp $ 3 | 4 | ;;; Copyright (c) 2008, Dr. Edmund Weitz. All rights reserved. 5 | 6 | ;;; Redistribution and use in source and binary forms, with or without 7 | ;;; modification, are permitted provided that the following conditions 8 | ;;; are met: 9 | 10 | ;;; * Redistributions of source code must retain the above copyright 11 | ;;; notice, this list of conditions and the following disclaimer. 12 | 13 | ;;; * Redistributions in binary form must reproduce the above 14 | ;;; copyright notice, this list of conditions and the following 15 | ;;; disclaimer in the documentation and/or other materials 16 | ;;; provided with the distribution. 17 | 18 | ;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED 19 | ;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 20 | ;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 21 | ;;; ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY 22 | ;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 23 | ;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE 24 | ;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 25 | ;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, 26 | ;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 27 | ;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 28 | ;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 29 | 30 | (in-package :cl-interpol-test) 31 | 32 | (defvar *this-file* (load-time-value 33 | (or #.*compile-file-pathname* *load-pathname*)) 34 | "The location of this source file. Needed to find the data files.") 35 | 36 | (defmacro do-tests ((name &optional show-progress-p) &body body) 37 | "Helper macro which repeatedly executes BODY until the code in body 38 | calls the function DONE. It is assumed that each invocation of BODY 39 | will be the execution of one test which returns NIL in case of success 40 | and a list of strings describing errors otherwise. 41 | 42 | The macro prints a simple progress indicator \(one dots for ten tests) 43 | to *STANDARD-OUTPUT* unless SHOW-PROGRESS-P is NIL and returns a true 44 | value iff all tests succeeded. Errors in BODY are caught and reported 45 | \(and counted as failures)." 46 | `(let ((successp t) 47 | (testcount 1)) 48 | (block test-block 49 | (flet ((done () 50 | (return-from test-block successp))) 51 | (format t "~&Test: ~A~%" ,name) 52 | (loop 53 | (when (and ,show-progress-p (zerop (mod testcount 10))) 54 | (format t ".") 55 | (when (zerop (mod testcount 100)) 56 | (terpri)) 57 | (force-output)) 58 | (let ((errors 59 | (handler-case 60 | (progn ,@body) 61 | (error (msg) 62 | (list (format nil "~&got an unexpected error: ~A" msg)))))) 63 | (setq successp (and successp (null errors))) 64 | (when errors 65 | (format t "~&~4@A:~{~& ~A~}~%" testcount errors)) 66 | (incf testcount))))) 67 | successp)) 68 | 69 | (defun simple-tests (&key (file-name 70 | (make-pathname :name "simple" 71 | :type nil :version nil 72 | :defaults *this-file*)) 73 | (external-format '(:latin-1 :eol-style :lf)) 74 | verbose 75 | named-readtables) 76 | "Loops through all the forms in the file FILE-NAME and executes each 77 | of them using EVAL. The CL-INTERPOL syntax is enabled when the forms 78 | are read. It is assumed that each FORM specifies a test which returns 79 | a true value iff it succeeds. Prints each test form to 80 | *STANDARD-OUTPUT* if VERBOSE is true and shows a simple progress 81 | indicator otherwise. EXTERNAL-FORMAT is the FLEXI-STREAMS external 82 | format which is used to read the file. Returns a true value iff all 83 | tests succeeded. 84 | 85 | \(SETQ ...) forms are treated special in that they're just EVALuated 86 | but not counted as tests. The global special variables exported by 87 | CL-INTERPOL \(and some from CL-UNICODE as well) are rebound during the 88 | tests so that they can be safely set in the test files." 89 | (if named-readtables 90 | (named-readtables:in-readtable :interpol-syntax) 91 | (enable-interpol-syntax)) 92 | 93 | (unwind-protect 94 | (with-open-file (binary-stream file-name :element-type 'flex:octet) 95 | (let* ((stream (flex:make-flexi-stream binary-stream :external-format external-format)) 96 | (*package* (find-package :cl-interpol-test)) 97 | (*list-delimiter* *list-delimiter*) 98 | (*outer-delimiters* *outer-delimiters*) 99 | (*inner-delimiters* *inner-delimiters*) 100 | (*optional-delimiters-p* *optional-delimiters-p*) 101 | (*scripts-to-try* *scripts-to-try*) 102 | (*try-abbreviations-p* *try-abbreviations-p*) 103 | (*try-hex-notation-p* *try-hex-notation-p*) 104 | (*try-lisp-names-p* *try-lisp-names-p*) 105 | (*try-unicode1-names-p* *try-unicode1-names-p*)) 106 | (do-tests ((format nil "Simple tests from file ~S" (file-namestring file-name)) 107 | (not verbose)) 108 | (let* ((form (or (read stream nil) (done))) 109 | (setqp (eq (first form) 'setq))) 110 | (when (and verbose (not setqp)) 111 | (format t "~&~S" form)) 112 | (cond (setqp (eval form) nil) 113 | ((eval form) nil) 114 | (t (list (format nil "~S returned NIL" form)))))))) 115 | (if named-readtables 116 | (named-readtables:in-readtable :standard) 117 | (disable-interpol-syntax)))) 118 | 119 | (defun run-all-tests (&key verbose) 120 | "Runs all tests for CL-INTERPOL and returns a true value iff all 121 | tests succeeded. VERBOSE is interpreted by the individual test suites 122 | above." 123 | (let ((successp t)) 124 | (macrolet ((run-test-suite (&body body) 125 | `(unless (progn ,@body) 126 | (setq successp nil)))) 127 | ;; run the automatically generated Perl tests 128 | (loop for named-readtables in (list nil t) 129 | do (format t "~2&Testing with activation through ~A~2%" 130 | (if named-readtables 131 | "named-readtables" 132 | "enable-interpol-syntax")) 133 | (run-test-suite (simple-tests :file-name (make-pathname :name "perltests" 134 | :type nil :version nil 135 | :defaults *this-file*) 136 | :verbose verbose 137 | :named-readtables named-readtables)) 138 | (run-test-suite (simple-tests :verbose verbose 139 | :named-readtables named-readtables)))) 140 | (format t "~2&~:[Some tests failed~;All tests passed~]." successp) 141 | successp)) 142 | -------------------------------------------------------------------------------- /util.lisp: -------------------------------------------------------------------------------- 1 | ;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-INTERPOL; Base: 10 -*- 2 | ;;; $Header: /usr/local/cvsrep/cl-interpol/util.lisp,v 1.12 2008/07/23 14:41:37 edi Exp $ 3 | 4 | ;;; Copyright (c) 2003-2008, Dr. Edmund Weitz. All rights reserved. 5 | 6 | ;;; Redistribution and use in source and binary forms, with or without 7 | ;;; modification, are permitted provided that the following conditions 8 | ;;; are met: 9 | 10 | ;;; * Redistributions of source code must retain the above copyright 11 | ;;; notice, this list of conditions and the following disclaimer. 12 | 13 | ;;; * Redistributions in binary form must reproduce the above 14 | ;;; copyright notice, this list of conditions and the following 15 | ;;; disclaimer in the documentation and/or other materials 16 | ;;; provided with the distribution. 17 | 18 | ;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED 19 | ;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 20 | ;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 21 | ;;; ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY 22 | ;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 23 | ;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE 24 | ;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 25 | ;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, 26 | ;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 27 | ;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 28 | ;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 29 | 30 | (in-package :cl-interpol) 31 | 32 | (define-condition simple-reader-error (simple-condition reader-error) 33 | () 34 | (:documentation "A reader error which can be signalled by ERROR.")) 35 | 36 | (defmacro signal-reader-error (format-control &rest format-arguments) 37 | "Like ERROR but signals a SIMPLE-READER-ERROR for the stream 38 | *STREAM*." 39 | `(error 'simple-reader-error 40 | :stream *stream* 41 | :format-control ,format-control 42 | :format-arguments (list ,@format-arguments))) 43 | 44 | (defun string-list-to-string (string-list) 45 | "Concatenates a list of strings to one string." 46 | ;; this function was originally provided by JP Massar for CL-PPCRE; 47 | ;; note that we can't use APPLY with CONCATENATE here because of 48 | ;; CALL-ARGUMENTS-LIMIT 49 | (let ((total-size 0)) 50 | (dolist (string string-list) 51 | (incf total-size (length string))) 52 | (let ((result-string (make-array total-size :element-type 'character)) 53 | (curr-pos 0)) 54 | (dolist (string string-list) 55 | (replace result-string string :start1 curr-pos) 56 | (incf curr-pos (length string))) 57 | result-string))) 58 | 59 | (defun get-end-delimiter (start-delimiter delimiters &key errorp) 60 | "Find the closing delimiter corresponding to the opening delimiter 61 | START-DELIMITER in a list DELIMITERS which is formatted like 62 | *OUTER-DELIMITERS*. If ERRORP is true, signal an error if none was 63 | found, otherwise return NIL." 64 | (loop for element in delimiters 65 | if (eql start-delimiter element) 66 | do (return-from get-end-delimiter start-delimiter) 67 | else if (and (consp element) 68 | (char= start-delimiter (car element))) 69 | do (return-from get-end-delimiter (cdr element))) 70 | (when errorp 71 | (signal-reader-error "~S not allowed as a delimiter here" start-delimiter))) 72 | 73 | (declaim (inline make-collector)) 74 | (defun make-collector () 75 | "Create an empty string which can be extended by 76 | VECTOR-PUSH-EXTEND." 77 | (make-array 0 78 | :element-type 'character 79 | :fill-pointer t 80 | :adjustable t)) 81 | 82 | (declaim (inline make-char-from-code)) 83 | (defun make-char-from-code (number) 84 | "Create character from char-code NUMBER. NUMBER can be NIL which is 85 | interpreted as 0." 86 | ;; Only look at rightmost eight bits in compliance with Perl 87 | (let ((code (logand #o377 (or number 0)))) 88 | (or (and (< code char-code-limit) 89 | (code-char code)) 90 | (signal-reader-error "No character for char-code #x~X" 91 | number)))) 92 | 93 | (declaim (inline lower-case-p*)) 94 | (defun lower-case-p* (char) 95 | "Whether CHAR is a character which has case and is lowercase." 96 | (or (not (both-case-p char)) 97 | (lower-case-p char))) 98 | 99 | (defmacro read-char* () 100 | "Convenience macro because we always read from the same string with 101 | the same arguments." 102 | `(read-char *stream* t nil t)) 103 | 104 | (defmacro peek-char* () 105 | "Convenience macro because we always peek at the same string with 106 | the same arguments." 107 | `(peek-char nil *stream* t nil t)) 108 | 109 | (declaim (inline copy-readtable*)) 110 | (defun copy-readtable* () 111 | "Returns a copy of the readtable which was current when 112 | INTERPOL-READER was invoked. Memoizes its result." 113 | (or *readtable-copy* 114 | (setq *readtable-copy* (copy-readtable)))) 115 | 116 | (declaim (inline nsubvec)) 117 | (defun nsubvec (sequence start &optional (end (length sequence))) 118 | "Return a subvector by pointing to location in original vector." 119 | (make-array (- end start) 120 | :element-type (array-element-type sequence) 121 | :displaced-to sequence 122 | :displaced-index-offset start)) 123 | --------------------------------------------------------------------------------