├── .gitignore ├── LICENSE ├── NEWS ├── TODO ├── adler32.lisp ├── bzip2.lisp ├── chipz.asd ├── conditions.lisp ├── constants.lisp ├── crc32.lisp ├── decompress.lisp ├── doc ├── chipz-doc.txt ├── index.html └── style.css ├── dstate.lisp ├── gzip.lisp ├── inflate-state.lisp ├── inflate.lisp ├── package.lisp ├── stream-fallback.lisp ├── stream.lisp ├── tests.lisp ├── types-and-tables.lisp └── zlib.lisp /.gitignore: -------------------------------------------------------------------------------- 1 | *.fasl 2 | *.FASL 3 | *.ufasl 4 | *.ufsl 5 | *.dx32fsl 6 | *.dx64fsl 7 | *.pfsl 8 | *.dfsl 9 | *.p64fsl 10 | *.d64fsl 11 | *.lx32fsl 12 | *.lx64fsl 13 | *.fx32fsl 14 | *.fx64fsl 15 | *.fas 16 | *.lib 17 | 18 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2004, Nathan Froyd. All rights reserved. 2 | 3 | Redistribution and use in source and binary forms, with or without 4 | modification, are permitted provided that the following conditions are 5 | met: 6 | 7 | * Redistributions of source code must retain the above copyright 8 | notice, this list of conditions and the following disclaimer. 9 | 10 | * Redistributions in binary form must reproduce the above copyright 11 | notice, this list of conditions and the following disclaimer in the 12 | documentation and/or other materials provided with the distribution. 13 | 14 | * Neither the name of Nathan Froyd nor the names of the contributors to 15 | this software may be used to endorse or promote products derived from 16 | this software without specific prior written permission. 17 | 18 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS 19 | IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED 20 | TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A 21 | PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER 22 | OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, 23 | EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, 24 | PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR 25 | PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF 26 | LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 27 | NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 28 | SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 29 | -------------------------------------------------------------------------------- /NEWS: -------------------------------------------------------------------------------- 1 | hey emacs, show me an -*- mode: outline -*- 2 | 3 | * Version 0.8, released 2013-01-14 4 | 5 | ** bug fixes 6 | 7 | Stack overflow problems on several implementations have been fixed. 8 | (Thanks to Zach Beane for pointing this out.) 9 | 10 | Anton Kovalenko contributed several fixes to the bzip2 code. 11 | 12 | Several bugs have been fixed with the Gray streams support. (Thanks to 13 | Anton Kovalenko and Felix Lange for the fixes and zort on Github for a 14 | bug report.) 15 | 16 | Gray streams support works with recent versions of Allegro Common Lisp. 17 | (Thanks to Dave Cooper.) 18 | 19 | ** improvements 20 | 21 | Consing when computing the CRC32 checksum for gzip has been reduced. 22 | (Thanks to Zach Beane for permission to pull his code from Salza.) 23 | 24 | MAKE-DECOMPRESSING-STREAM now singals an error if Gray streams are not 25 | supported. (Thanks to Felix Lange.) 26 | 27 | ** new features 28 | 29 | A new method on DECOMPRESS has been added for decompressing files into 30 | memory. 31 | 32 | CLISP now supports decompressing streams. (Thanks to Felix Lange.) 33 | 34 | * Version 0.7.4, released 2009-10-26 35 | 36 | ** bug fixes 37 | 38 | CRC32 checksums are now calculated properly. (Thanks to Sven Van 39 | Caekenberghe.) 40 | 41 | ** improvements 42 | 43 | READ-SEQUENCE is now supported on Gray streams, which may result in a 44 | significant performance boost. 45 | 46 | * Version 0.7.3, released 2009-05-09 47 | 48 | ** bug fixes 49 | 50 | Checksum errors now throw the correct error, rather than complaining 51 | about an unknown checksum-mismatch error. (Thanks to _3b on #lisp.) 52 | 53 | Fixed an error when decompressing raw deflate data. (Thanks to _3b on 54 | #lisp.) 55 | 56 | Fixed an error in rare cases when reading checksums. (Thanks to _3b on 57 | #lisp.) 58 | 59 | * Version 0.7.2, released 2008-12-20 60 | 61 | ** bug fixes 62 | 63 | Gray streams now work correctly. (Thanks to Austin Haas.) 64 | 65 | * Version 0.7.1, released 2008-12-17 66 | 67 | ** bug fixes 68 | 69 | Fixed some errors related to confusion between keywords and CHIPZ 70 | symbols. (Thanks to Austin Haas and Tomas Zellerin.) 71 | 72 | * Version 0.7.0, released 2008-10-24 73 | 74 | ** new features 75 | 76 | bzip2 decompression is now included. Please see the documentation for 77 | more details. 78 | 79 | ** improvements 80 | 81 | DECOMPRESS now supports the keyword argument :BUFFER-SIZE. Please see 82 | the documentation for more details. 83 | 84 | Checksum verification is now performed for gzip and zlib data. 85 | 86 | * Version 0.6.1, released 2008-10-03 87 | 88 | ** bug fixes 89 | 90 | Various fixes have been applied to increase Chipz's portability. 91 | (Thanks to Zach Beane.) 92 | 93 | * Version 0.6.0, released 2008-09-27 94 | 95 | ** new features 96 | 97 | Gray streams are now supported on Lispworks, CMUCL, Allegro, and 98 | OpenMCL. 99 | 100 | There has been a significant amount of optimization work done on the 101 | library and decompression shows better than order-of-magnitude speedups 102 | on SBCL. 103 | 104 | * Version 0.5.1, released 2008-02-09 105 | 106 | ** bug fixes 107 | 108 | DECOMPRESS now handles :INPUT-END properly in all cases. (Thanks to 109 | Jeremy English.) 110 | 111 | * Version 0.5, released 2008-01-02 112 | 113 | Completely redid the structure of the library. The API is now modeled 114 | after that of `zlib', with convenience functions added. 115 | 116 | Gray streams interfaces added. 117 | 118 | * Version 0.1.1, released 2006-01-27 119 | 120 | Fixed PROCESS-NO-COMPRESS-BLOCK to read from the underlying stream, 121 | rather than the bitstream. (Thanks to Eric Marsden.) 122 | 123 | * Version 0.1, released 2006-01-27 124 | 125 | Initial release. 126 | -------------------------------------------------------------------------------- /TODO: -------------------------------------------------------------------------------- 1 | * provide some way to look at gzip/zlib header information 2 | -------------------------------------------------------------------------------- /adler32.lisp: -------------------------------------------------------------------------------- 1 | ;;; adler32.lisp -- computing adler32 checksums (rfc1950) 2 | 3 | (in-package :chipz) 4 | 5 | (defstruct (adler32 6 | (:copier copy-adler32)) 7 | (s1 1 :type fixnum) 8 | (s2 0 :type fixnum)) 9 | 10 | (defun update-adler32 (state vector start end) 11 | (declare (type simple-octet-vector vector)) 12 | (declare (type index start end)) 13 | ;; many thanks to Xach for his code from Salza. 14 | (let ((length (- end start)) 15 | (i 0) 16 | (k 0) 17 | (s1 (adler32-s1 state)) 18 | (s2 (adler32-s2 state))) 19 | (declare (type index i k length) 20 | (type fixnum s1 s2)) 21 | (unless (zerop length) 22 | (tagbody 23 | loop 24 | (setf k (min 16 length)) 25 | (decf length k) 26 | sum 27 | (setf s1 (+ (aref vector (+ start i)) s1)) 28 | (setf s2 (+ s1 s2)) 29 | (decf k) 30 | (incf i) 31 | (unless (zerop k) 32 | (go sum)) 33 | (setf s1 (mod s1 adler32-modulo)) 34 | (setf s2 (mod s2 adler32-modulo)) 35 | (unless (zerop length) 36 | (go loop)) 37 | end 38 | (setf (adler32-s1 state) s1 39 | (adler32-s2 state) s2) 40 | (return-from update-adler32 state))))) 41 | 42 | (defun produce-adler32 (state) 43 | (logior (ash (adler32-s2 state) 16) 44 | (adler32-s1 state))) 45 | -------------------------------------------------------------------------------- /bzip2.lisp: -------------------------------------------------------------------------------- 1 | (in-package :chipz) 2 | 3 | ;;; bzip2's decompress.c looks relatively simple, but a great deal of 4 | ;;; complexity and cleverness is hidden behind C preprpocessor macro. 5 | ;;; The single biggest help in understand what is going on behind the 6 | ;;; macros is to read "Coroutines in C" by Simon Tatham: 7 | ;;; 8 | ;;; http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html 9 | ;;; 10 | ;;; decompress.c is using the same technique described in the paper, 11 | ;;; although with a slightly different implementation. 12 | ;;; 13 | ;;; Lisp, fortunately/alas, does not admit the same sort of techniques 14 | ;;; that C does--at least not expressed exactly the same way. So our 15 | ;;; translation naturally differs in some places. For example, to make 16 | ;;; it easier to figure out how much state we have to preserve, we 17 | ;;; choose to read more in at one time than decompress.c--the magic 18 | ;;; number header all at once or the bits for the mapping table in 19 | ;;; larger chunks than 1 bit at a time, for instance. 20 | 21 | ;;; Reading things in larger chunks than bits means that we have to do 22 | ;;; bit-reversal of various quantities. 23 | 24 | (defun reverse-ub4 (x) 25 | (let ((table (load-time-value (make-array 16 :element-type 'fixnum 26 | :initial-contents '(0 8 4 12 27 | 2 10 6 14 28 | 1 9 5 13 29 | 3 11 7 15))))) 30 | (aref table x))) 31 | 32 | (defun reverse-ub8 (x) 33 | (logior (ash (reverse-ub4 (ldb (byte 4 0) x)) 4) 34 | (reverse-ub4 (ldb (byte 4 4) x)))) 35 | 36 | (defun reverse-ub16 (x) 37 | (logior (ash (reverse-ub8 (ldb (byte 8 0) x)) 8) 38 | (reverse-ub8 (ldb (byte 8 8) x)))) 39 | 40 | (defvar *dummy-vec* (make-array #.+bz-max-alpha-size+ :element-type '(unsigned-byte 32))) 41 | 42 | (defstruct (bzip2-state 43 | (:include decompression-state) 44 | (:constructor %make-bzip2-state)) 45 | ;; For doing the final run-length decoding. 46 | (out-ch 0 :type (unsigned-byte 8)) 47 | (out-len 0 :type (integer 0 260)) 48 | (block-randomized-p nil) 49 | (rntogo 0 :type (unsigned-byte 32)) 50 | (rntpos 0 :type (unsigned-byte 32)) 51 | 52 | (100k-block-size 1 :type (integer 1 9)) 53 | (small-decompression-p nil) 54 | (current-block-number 0) 55 | 56 | ;; For undoing the Burrows-Wheeler transform. */ 57 | (original-pointer 0) 58 | (t-position 0 :type (integer 0 (900000))) 59 | (k0 0) 60 | (unzftab (make-array 256 :element-type '(unsigned-byte 32)) 61 | :type (simple-array (unsigned-byte 32) (256))) 62 | (n-blocks-used 0) 63 | (cftab (make-array 257 :element-type '(unsigned-byte 32)) 64 | :type (simple-array (unsigned-byte 32) (257))) 65 | (cftab-copy (make-array 257 :element-type '(unsigned-byte 32)) 66 | :type (simple-array (unsigned-byte 32) (257))) 67 | 68 | ;; For undoing the Burrows-Wheeler transform (FAST). 69 | (tt (make-array 0 :element-type '(unsigned-byte 32)) 70 | :type (simple-array (unsigned-byte 32) (*))) 71 | 72 | ;; Stored and calculated CRCs. 73 | (stored-block-crc 0 :type (unsigned-byte 32)) 74 | (stored-combined-crc 0 :type (unsigned-byte 32)) 75 | (calculated-block-crc #xffffffff :type (unsigned-byte 32)) 76 | (calculated-combined-crc 0 :type (unsigned-byte 32)) 77 | 78 | ;; Map of bytes used in block ("mapping table"). 79 | (n-in-use 0 :type (integer 0 256)) 80 | (in-use (make-array 256 :initial-element nil) 81 | :type (simple-array t (256))) 82 | ;; This was a byte array; we have chosen to make it a simple integer 83 | ;; and index it with LOGBITP. 84 | (in-use-16 0 :type (unsigned-byte 16)) 85 | (seq-to-unseq (make-array 256 :element-type '(unsigned-byte 8)) 86 | :type (simple-array (unsigned-byte 8) (256))) 87 | 88 | ;; For decoding the MTF values. 89 | (mtfa (make-array +mtfa-size+ :element-type '(unsigned-byte 8)) 90 | :type (simple-array (unsigned-byte 8) (#.+mtfa-size+))) 91 | (mtfbase (make-array (/ 256 +mtfl-size+) :element-type '(unsigned-byte 16)) 92 | :type (simple-array (unsigned-byte 16) (#.(/ 256 +mtfl-size+)))) 93 | (selector (make-array +bz-max-selectors+ :element-type '(unsigned-byte 8)) 94 | :type (simple-array (unsigned-byte 8) (#.+bz-max-selectors+))) 95 | (selector-mtf (make-array +bz-max-selectors+ :element-type '(unsigned-byte 8)) 96 | :type (simple-array (unsigned-byte 8) (#.+bz-max-selectors+))) 97 | (len (make-array '(#.+bz-n-groups+ #.+bz-max-alpha-size+) 98 | :element-type '(unsigned-byte 8)) 99 | :type (simple-array (unsigned-byte 8) (#.+bz-n-groups+ #.+bz-max-alpha-size+))) 100 | (mtf-continuation nil :type (or null function)) 101 | 102 | (limit #1=(let ((w (make-array +bz-n-groups+))) 103 | (dotimes (i +bz-n-groups+ w) 104 | (setf (aref w i) (make-array +bz-max-alpha-size+ 105 | :element-type '(unsigned-byte 32))))) 106 | :type (simple-array t (#.+bz-n-groups+))) 107 | (base #1# 108 | :type (simple-array t (#.+bz-n-groups+))) 109 | (perm #1# 110 | :type (simple-array t (#.+bz-n-groups+))) 111 | (min-lengths (make-array #.+bz-n-groups+ :element-type '(unsigned-byte 32)) 112 | :type (simple-array (unsigned-byte 32) (#.+bz-n-groups+))) 113 | 114 | ;; Save variables for scalars in the decompression code. 115 | (i 0) 116 | (j 0) 117 | (alpha-size 0 :type (integer 0 258)) 118 | (n-groups 0) 119 | (n-selectors 0) 120 | (EOB 0 :type (integer 0 257)) 121 | ;; FIXME: check on the declarations for these three. 122 | (group-number 0 :type fixnum) 123 | (group-position 0 :type fixnum) 124 | (lval 0 :type fixnum) 125 | (nblockMAX 0 :type (integer 0 900000)) 126 | (nblock 0 :type (integer 0 (900000))) 127 | (es 0 :type fixnum) 128 | (N 0 :type fixnum) 129 | (curr 0 :type (integer 0 20)) 130 | (zn 0 :type (integer 0 20)) 131 | (zvec 0 :type (integer 0 #.(expt 2 20))) 132 | (g-minlen 0 :type (integer 0 23)) 133 | (g-limit *dummy-vec* 134 | :type (simple-array (unsigned-byte 32) (#.+bz-max-alpha-size+))) 135 | (g-base *dummy-vec* 136 | :type (simple-array (unsigned-byte 32) (#.+bz-max-alpha-size+))) 137 | (g-perm *dummy-vec* 138 | :type (simple-array (unsigned-byte 32) (#.+bz-max-alpha-size+)))) 139 | 140 | (defmethod print-object ((object bzip2-state) stream) 141 | (print-unreadable-object (object stream) 142 | (format stream "Bzip2 state bits: ~X/~D input: ~D/~D output ~D/~D" 143 | (bzip2-state-bits object) 144 | (bzip2-state-n-bits object) 145 | (bzip2-state-input-index object) 146 | (bzip2-state-input-end object) 147 | (bzip2-state-output-index object) 148 | (bzip2-state-output-end object)))) 149 | 150 | (defun make-maps (state) 151 | (declare (type bzip2-state state)) 152 | (loop with n-in-use = 0 153 | with in-use-table = (bzip2-state-in-use state) 154 | with seq-to-unseq = (bzip2-state-seq-to-unseq state) 155 | for i from 0 below 256 156 | when (aref in-use-table i) 157 | do (setf (aref seq-to-unseq n-in-use) i 158 | n-in-use (1+ n-in-use)) 159 | finally 160 | (return (setf (bzip2-state-n-in-use state) n-in-use)))) 161 | 162 | (defun make-decode-tables (state group min-len max-len alpha-size) 163 | (declare (type bzip2-state state)) 164 | (let* ((limit (aref (bzip2-state-limit state) group)) 165 | (base (aref (bzip2-state-base state) group)) 166 | (perm (aref (bzip2-state-perm state) group)) 167 | (len (bzip2-state-len state)) 168 | (rmi (array-row-major-index len group 0))) 169 | (loop with pp = 0 170 | for i from min-len to max-len 171 | do (dotimes (j alpha-size) 172 | (when (= (row-major-aref len (+ rmi j)) i) 173 | (setf (aref perm pp) j) 174 | (incf pp)))) 175 | (loop for i from 0 below +bz-max-code-len+ 176 | do (setf (aref base i) 0 177 | (aref limit i) 0)) 178 | (loop for i from 0 below alpha-size 179 | do (incf (aref base (1+ (row-major-aref len (+ i rmi)))))) 180 | (loop for i from 1 below +bz-max-code-len+ 181 | do (incf (aref base i) 182 | (aref base (1- i)))) 183 | (loop with vec = 0 184 | for i from min-len to max-len 185 | do (incf vec (- (aref base (1+ i)) 186 | (aref base i))) 187 | (setf (aref limit i) (1- vec) 188 | vec (ash vec 1))) 189 | (loop for i from (+ min-len 1) to max-len 190 | do (setf (aref base i) 191 | (- (ash (1+ (aref limit (1- i))) 1) 192 | (aref base i)))))) 193 | 194 | (defun undo-rle-obuf-to-output (state) 195 | (declare (optimize speed)) 196 | (cond 197 | ((bzip2-state-block-randomized-p state) 198 | (error 'bzip2-randomized-blocks-unimplemented)) 199 | (t 200 | (let ((calculated-block-crc (bzip2-state-calculated-block-crc state)) 201 | (out-ch (bzip2-state-out-ch state)) 202 | (out-len (bzip2-state-out-len state)) 203 | (n-blocks-used (bzip2-state-n-blocks-used state)) 204 | (k0 (bzip2-state-k0 state)) 205 | (k1 0) 206 | (tt (bzip2-state-tt state)) 207 | (t-position (bzip2-state-t-position state)) 208 | (nblockpp (1+ (bzip2-state-nblock state))) 209 | (output (bzip2-state-output state)) 210 | (index (bzip2-state-output-index state)) 211 | (end (bzip2-state-output-end state))) 212 | (declare (type (unsigned-byte 32) calculated-block-crc)) 213 | (declare (type (integer 0 260) out-len)) 214 | (declare (type (unsigned-byte 8) k0 k1)) 215 | (declare (type (integer 0 900000) n-blocks-used nblockpp)) 216 | (declare (type (unsigned-byte 32) t-position)) 217 | (macrolet ((get-fast () 218 | `(prog2 219 | (setf t-position (aref tt t-position)) 220 | (logand t-position #xff) 221 | (setf t-position (ash t-position -8))))) 222 | (tagbody 223 | START 224 | ;; "try to finish existing run" 225 | (when (zerop out-len) 226 | (go GRAB-MORE)) 227 | (loop 228 | (when (= index end) 229 | (go FINISH)) 230 | (when (= out-len 1) 231 | (go LEN-EQUAL-ONE)) 232 | (setf (aref output index) out-ch) 233 | (setf calculated-block-crc 234 | (logand #xffffffff 235 | (logxor (ash calculated-block-crc 8) 236 | (aref +bzip2-crc32-table+ 237 | (logxor (ash calculated-block-crc -24) out-ch))))) 238 | (decf out-len) 239 | (incf index)) 240 | LEN-EQUAL-ONE 241 | (when (= index end) 242 | (setf out-len 1) 243 | (go FINISH)) 244 | (setf (aref output index) out-ch) 245 | (setf calculated-block-crc 246 | (logand #xffffffff 247 | (logxor (ash calculated-block-crc 8) 248 | (aref +bzip2-crc32-table+ 249 | (logxor (ash calculated-block-crc -24) out-ch))))) 250 | (incf index) 251 | GRAB-MORE 252 | ;; "Only caused by corrupt data stream?" 253 | (when (> n-blocks-used nblockpp) 254 | (return-from undo-rle-obuf-to-output t)) 255 | (when (= n-blocks-used nblockpp) 256 | (setf out-len 0) 257 | (go FINISH)) 258 | (setf out-ch k0) 259 | 260 | (setf k1 (get-fast)) 261 | (incf n-blocks-used) 262 | (unless (= k1 k0) 263 | (setf k0 k1) 264 | (go LEN-EQUAL-ONE)) 265 | (when (= n-blocks-used nblockpp) 266 | (go LEN-EQUAL-ONE)) 267 | 268 | (setf out-len 2) 269 | (setf k1 (get-fast)) 270 | (incf n-blocks-used) 271 | (when (= n-blocks-used nblockpp) 272 | (go CONTINUE)) 273 | (unless (= k1 k0) 274 | (setf k0 k1) 275 | (go CONTINUE)) 276 | 277 | (setf out-len 3) 278 | (setf k1 (get-fast)) 279 | (incf n-blocks-used) 280 | (when (= n-blocks-used nblockpp) 281 | (go CONTINUE)) 282 | (unless (= k1 k0) 283 | (setf k0 k1) 284 | (go CONTINUE)) 285 | 286 | (setf k1 (get-fast)) 287 | (incf n-blocks-used) 288 | (setf out-len (+ k1 4)) 289 | (setf k0 (get-fast)) 290 | (incf n-blocks-used) 291 | CONTINUE 292 | (go START) 293 | FINISH) 294 | 295 | #+nil 296 | (incf (bzip2-state-total-out state) 297 | (- index (bzip2-state-output-index state) )) 298 | ;; Restore cached values. 299 | (setf (bzip2-state-calculated-block-crc state) calculated-block-crc 300 | (bzip2-state-out-ch state) out-ch 301 | (bzip2-state-out-len state) out-len 302 | (bzip2-state-n-blocks-used state) n-blocks-used 303 | (bzip2-state-k0 state) k0 304 | (bzip2-state-t-position state) t-position 305 | (bzip2-state-output-index state) index) 306 | nil))))) 307 | 308 | ;;; decompress.c has various logic relating to whether the user has 309 | ;;; chosen "small" decompression, which uses less memory. We're just 310 | ;;; going to be memory-intensive and always pick the large option. Maybe 311 | ;;; someday we can come back and add the small option. 312 | 313 | (defun %bzip2-state-machine (state) 314 | (declare (type bzip2-state state)) 315 | (declare (optimize (speed 3) (debug 1) (space 0) (compilation-speed 0))) 316 | ;; See the enormous comment in %INFLATE-STATE-MACHINE for what's going 317 | ;; on here. 318 | (macrolet ((transition-to (next-state) 319 | `(progn 320 | (setf (bzip2-state-state state) #',next-state) 321 | #+(or sbcl cmu) 322 | (,next-state state)))) 323 | (labels ( 324 | (read-bits (n state) 325 | (declare (type (integer 0 32) n)) 326 | (declare (type bzip2-state state)) 327 | (prog1 328 | ;; We don't use (BYTE N (- ...)) here because doing it 329 | ;; this way is ~10% faster on SBCL. 330 | (ldb (byte n 0) 331 | (ash (bzip2-state-bits state) 332 | (the (integer -31 0) 333 | (- n (bzip2-state-n-bits state))))) 334 | (decf (bzip2-state-n-bits state) n))) 335 | 336 | (ensure-bits (n state) 337 | (declare (type (integer 0 32) n)) 338 | (declare (type bzip2-state state)) 339 | (let ((bits (bzip2-state-bits state)) 340 | (n-bits (bzip2-state-n-bits state)) 341 | (input-index (bzip2-state-input-index state))) 342 | (declare (type (unsigned-byte 32) bits)) 343 | (loop while (< n-bits n) 344 | when (>= input-index (bzip2-state-input-end state)) 345 | do (progn 346 | (setf (bzip2-state-bits state) bits 347 | (bzip2-state-n-bits state) n-bits 348 | (bzip2-state-input-index state) input-index) 349 | (throw 'bzip2-done nil)) 350 | do (let ((byte (aref (bzip2-state-input state) input-index))) 351 | (declare (type (unsigned-byte 8) byte)) 352 | (setf bits 353 | (logand #xffffffff (logior (ash bits 8) byte))) 354 | (incf n-bits 8) 355 | (incf input-index)) 356 | finally (setf (bzip2-state-bits state) bits 357 | (bzip2-state-n-bits state) n-bits 358 | (bzip2-state-input-index state) input-index)))) 359 | 360 | (ensure-and-read-bits (n state) 361 | (ensure-bits n state) 362 | (read-bits n state)) 363 | 364 | (bzip2-header (state) 365 | (declare (type bzip2-state state)) 366 | (let ((header-field (ensure-and-read-bits 32 state))) 367 | (declare (type (unsigned-byte 32) header-field)) 368 | (unless (and (= (ldb (byte 8 24) header-field) +bz-header-b+) 369 | (= (ldb (byte 8 16) header-field) +bz-header-z+) 370 | (= (ldb (byte 8 8) header-field) +bz-header-h+)) 371 | (error 'invalid-bzip2-magic)) 372 | (let ((block-size-magic-byte (ldb (byte 8 0) header-field))) 373 | (unless (<= (+ +bz-header-0+ 1) 374 | block-size-magic-byte 375 | (+ +bz-header-0+ 9)) 376 | (error 'invalid-bzip2-magic)) 377 | (setf (bzip2-state-100k-block-size state) (- block-size-magic-byte 378 | +bz-header-0+)) 379 | ;; BZIP2 SMALL 380 | (setf (bzip2-state-tt state) 381 | (make-array (* (bzip2-state-100k-block-size state) +100k+) 382 | :element-type '(unsigned-byte 32))) 383 | (transition-to bzip2-block-header1)))) 384 | 385 | (bzip2-block-header1 (state) 386 | (declare (type bzip2-state state)) 387 | (let ((byte (ensure-and-read-bits 8 state))) 388 | (case byte 389 | (#x17 (transition-to bzip2-end-header2)) 390 | (#x31 (transition-to bzip2-block-header2)) 391 | (t (error 'invalid-bzip2-data))))) 392 | 393 | (bzip2-block-header2 (state) 394 | (declare (type bzip2-state state)) 395 | (let ((byte (ensure-and-read-bits 8 state))) 396 | (if (= byte #x41) 397 | (transition-to bzip2-block-header3) 398 | (error 'invalid-bzip2-data)))) 399 | 400 | (bzip2-block-header3 (state) 401 | (declare (type bzip2-state state)) 402 | (let ((byte (ensure-and-read-bits 8 state))) 403 | (if (= byte #x59) 404 | (transition-to bzip2-block-header4) 405 | (error 'invalid-bzip2-data)))) 406 | 407 | (bzip2-block-header4 (state) 408 | (declare (type bzip2-state state)) 409 | (let ((byte (ensure-and-read-bits 8 state))) 410 | (if (= byte #x26) 411 | (transition-to bzip2-block-header5) 412 | (error 'invalid-bzip2-data)))) 413 | 414 | (bzip2-block-header5 (state) 415 | (declare (type bzip2-state state)) 416 | (let ((byte (ensure-and-read-bits 8 state))) 417 | (if (= byte #x53) 418 | (transition-to bzip2-block-header6) 419 | (error 'invalid-bzip2-data)))) 420 | 421 | (bzip2-block-header6 (state) 422 | (declare (type bzip2-state state)) 423 | (let ((byte (ensure-and-read-bits 8 state))) 424 | (unless (= byte #x59) 425 | (error 'invalid-bzip2-data)) 426 | (incf (bzip2-state-current-block-number state)) 427 | (transition-to bzip2-block-crc32))) 428 | 429 | (bzip2-block-crc32 (state) 430 | (declare (type bzip2-state state)) 431 | (let ((crc32-hi (ensure-and-read-bits 16 state)) 432 | (crc32-lo (ensure-and-read-bits 16 state))) 433 | (setf (bzip2-state-stored-block-crc state) 434 | (logior (ash crc32-hi 16) crc32-lo)) 435 | (transition-to bzip2-block-randombit))) 436 | 437 | (bzip2-block-randombit (state) 438 | (declare (type bzip2-state state)) 439 | (let ((randomized-p (ensure-and-read-bits 1 state))) 440 | (setf (bzip2-state-block-randomized-p state) (= randomized-p 1)) 441 | (transition-to bzip2-original-pointer))) 442 | 443 | (bzip2-original-pointer (state) 444 | (declare (type bzip2-state state)) 445 | (let ((original-pointer (ensure-and-read-bits 24 state))) 446 | (unless (<= 0 original-pointer 447 | (+ 10 (* (bzip2-state-100k-block-size state) +100k+))) 448 | (error 'invalid-bzip2-data)) 449 | (setf (bzip2-state-original-pointer state) original-pointer) 450 | (transition-to bzip2-mapping-table1))) 451 | 452 | (bzip2-mapping-table1 (state) 453 | (declare (type bzip2-state state)) 454 | (let ((in-use-16 (reverse-ub16 (ensure-and-read-bits 16 state)))) 455 | (setf (bzip2-state-in-use-16 state) in-use-16) 456 | (setf (bzip2-state-i state) 0) 457 | (fill (bzip2-state-in-use state) nil) 458 | (transition-to bzip2-mapping-table2))) 459 | 460 | (bzip2-mapping-table2 (state) 461 | (declare (type bzip2-state state)) 462 | (loop with in-use-16 = (bzip2-state-in-use-16 state) 463 | with in-use-table = (bzip2-state-in-use state) 464 | while (< (bzip2-state-i state) 16) 465 | when (logbitp (bzip2-state-i state) in-use-16) 466 | do (let ((in-use (reverse-ub16 (ensure-and-read-bits 16 state)))) 467 | (dotimes (i 16) 468 | (setf (aref in-use-table (+ (* (bzip2-state-i state) 16) 469 | i)) 470 | (logbitp i in-use)))) 471 | do 472 | (incf (bzip2-state-i state))) 473 | (let ((n-in-use (make-maps state))) 474 | (when (zerop n-in-use) 475 | (error 'invalid-bzip2-data)) 476 | (setf (bzip2-state-alpha-size state) 477 | (+ n-in-use 2)) 478 | (transition-to bzip2-selector1))) 479 | 480 | (bzip2-selector1 (state) 481 | (declare (type bzip2-state state)) 482 | (let ((n-groups (ensure-and-read-bits 3 state))) 483 | (unless (<= 3 n-groups 6) 484 | (error 'invalid-bzip2-data)) 485 | (setf (bzip2-state-n-groups state) n-groups) 486 | (transition-to bzip2-selector2))) 487 | 488 | (bzip2-selector2 (state) 489 | (declare (type bzip2-state state)) 490 | (let ((n-selectors (ensure-and-read-bits 15 state))) 491 | (unless (plusp n-selectors) 492 | (error 'invalid-bzip2-data)) 493 | (setf (bzip2-state-n-selectors state) n-selectors) 494 | (setf (bzip2-state-i state) 0) 495 | (transition-to bzip2-selector3a))) 496 | 497 | (bzip2-selector3a (state) 498 | (declare (type bzip2-state state)) 499 | (setf (bzip2-state-j state) 0) 500 | (transition-to bzip2-selector3b)) 501 | 502 | (bzip2-selector3b (state) 503 | (declare (type bzip2-state state)) 504 | (loop 505 | do (let ((bit (ensure-and-read-bits 1 state))) 506 | (when (zerop bit) (loop-finish)) 507 | (when (>= (incf (bzip2-state-j state)) 508 | (bzip2-state-n-groups state)) 509 | (error 'invalid-bzip2-data))) 510 | finally 511 | (setf (aref (bzip2-state-selector-mtf state) 512 | (bzip2-state-i state)) 513 | (bzip2-state-j state))) 514 | (if (< (incf (bzip2-state-i state)) 515 | (bzip2-state-n-selectors state)) 516 | (transition-to bzip2-selector3a) 517 | (transition-to bzip2-selector-undo-mtf-values))) 518 | 519 | (bzip2-selector-undo-mtf-values (state) 520 | (declare (type bzip2-state state)) 521 | (let ((pos (make-array +bz-n-groups+ 522 | :element-type '(unsigned-byte 8))) 523 | (n-groups (bzip2-state-n-groups state)) 524 | (n-selectors (bzip2-state-n-selectors state)) 525 | (selector-table (bzip2-state-selector state)) 526 | (selector-mtf (bzip2-state-selector-mtf state))) 527 | (declare (dynamic-extent pos)) 528 | (dotimes (i n-groups) 529 | (setf (aref pos i) i)) 530 | (dotimes (i n-selectors) 531 | (let* ((v (aref selector-mtf i)) 532 | (tmp (aref pos v))) 533 | (loop until (zerop v) 534 | do (setf (aref pos v) (aref pos (1- v))) 535 | (decf v)) 536 | (setf (aref pos 0) tmp) 537 | (setf (aref selector-table i) tmp))) 538 | (setf (bzip2-state-j state) 0) 539 | (transition-to bzip2-coding-tables-groups-loop))) 540 | 541 | (bzip2-coding-tables-groups-loop (state) 542 | (declare (type bzip2-state state)) 543 | (cond 544 | ((< (bzip2-state-j state) (bzip2-state-n-groups state)) 545 | (setf (bzip2-state-curr state) (ensure-and-read-bits 5 state) 546 | (bzip2-state-i state) 0) 547 | (transition-to bzip2-coding-tables-alpha-loop)) 548 | (t 549 | (transition-to bzip2-create-huffman-decode-tables)))) 550 | 551 | (bzip2-coding-tables-alpha-loop (state) 552 | (declare (type bzip2-state state)) 553 | (unless (<= 1 (bzip2-state-curr state) 20) 554 | (error 'invalid-bzip2-data)) 555 | (let ((uc (ensure-and-read-bits 1 state))) 556 | (cond 557 | ((zerop uc) 558 | (setf (aref (bzip2-state-len state) (bzip2-state-j state) (bzip2-state-i state)) 559 | (bzip2-state-curr state)) 560 | (cond 561 | ((< (incf (bzip2-state-i state)) 562 | (bzip2-state-alpha-size state)) 563 | (bzip2-coding-tables-alpha-loop state)) 564 | (t 565 | (incf (bzip2-state-j state)) 566 | (transition-to bzip2-coding-tables-groups-loop)))) 567 | (t 568 | (transition-to bzip2-coding-tables-alpha-loop2))))) 569 | 570 | (bzip2-coding-tables-alpha-loop2 (state) 571 | (declare (type bzip2-state state)) 572 | (let ((uc (ensure-and-read-bits 1 state))) 573 | (if (zerop uc) 574 | (incf (bzip2-state-curr state)) 575 | (decf (bzip2-state-curr state))) 576 | (transition-to bzip2-coding-tables-alpha-loop))) 577 | 578 | (bzip2-create-huffman-decode-tables (state) 579 | (declare (type bzip2-state state)) 580 | (loop with n-groups = (bzip2-state-n-groups state) 581 | with len = (bzip2-state-len state) 582 | for x from 0 below n-groups 583 | do (loop with minLen = 32 584 | with maxLen = 0 585 | with alpha-size = (bzip2-state-alpha-size state) 586 | for y from 0 below alpha-size 587 | do (let ((xy (aref len x y))) 588 | (setf maxLen (max maxLen xy) 589 | minLen (min minLen xy))) 590 | finally 591 | (make-decode-tables state x minLen maxLen alpha-size) 592 | (setf (aref (bzip2-state-min-lengths state) x) minLen)) 593 | finally 594 | ;; We're not 'returning' anything here, we're just 595 | ;; forcing this call to be in tail position. 596 | (return (transition-to bzip2-initialize-mtf-values)))) 597 | 598 | (bzip2-initialize-mtf-values (state) 599 | (declare (type bzip2-state state)) 600 | (loop 601 | with kk = (1- +mtfa-size+) 602 | with mtfa = (bzip2-state-mtfa state) 603 | with mtfbase = (bzip2-state-mtfbase state) 604 | initially 605 | (setf (bzip2-state-EOB state) (1+ (bzip2-state-n-in-use state)) 606 | (bzip2-state-nblockMAX state) (* 100000 (bzip2-state-100k-block-size state)) 607 | (bzip2-state-group-number state) -1 608 | (bzip2-state-group-position state) 0) 609 | (fill (bzip2-state-unzftab state) 0) 610 | for i from (1- (floor 256 +mtfl-size+)) downto 0 611 | do (loop for j from (1- +mtfl-size+) downto 0 612 | do 613 | (setf (aref mtfa kk) (+ (* i +mtfl-size+) j)) 614 | (decf kk) 615 | finally 616 | (setf (aref mtfbase i) (1+ kk))) 617 | finally 618 | (setf (bzip2-state-nblock state) 0 619 | (bzip2-state-mtf-continuation state) #'bzip2-enter-mtf-decode-loop) 620 | ;; We're not 'returning' anything here, we're just 621 | ;; forcing this call to be in tail position. 622 | (return (transition-to bzip2-get-mtf-value)))) 623 | 624 | (bzip2-get-mtf-value (state) 625 | (declare (type bzip2-state state)) 626 | (when (zerop (bzip2-state-group-position state)) 627 | (when (>= (incf (bzip2-state-group-number state)) 628 | (bzip2-state-n-selectors state)) 629 | (error 'invalid-bzip2-data)) 630 | (let ((s (aref (bzip2-state-selector state) 631 | (bzip2-state-group-number state)))) 632 | (setf (bzip2-state-group-position state) +bz-g-size+ 633 | (bzip2-state-g-minlen state) (aref (bzip2-state-min-lengths state) s) 634 | (bzip2-state-g-limit state) (aref (bzip2-state-limit state) s) 635 | (bzip2-state-g-perm state) (aref (bzip2-state-perm state) s) 636 | (bzip2-state-g-base state) (aref (bzip2-state-base state) s)))) 637 | (decf (bzip2-state-group-position state)) 638 | (setf (bzip2-state-zn state) (bzip2-state-g-minlen state)) 639 | (transition-to bzip2-get-mtf-value1)) 640 | 641 | (bzip2-get-mtf-value1 (state) 642 | (declare (type bzip2-state state)) 643 | (let ((zvec (ensure-and-read-bits (bzip2-state-zn state) state))) 644 | (setf (bzip2-state-zvec state) zvec) 645 | (transition-to bzip2-get-mtf-value2))) 646 | 647 | (bzip2-get-mtf-value2 (state) 648 | (declare (type bzip2-state state)) 649 | (when (> (bzip2-state-zn state) 20) 650 | (error 'invalid-bzip2-data)) 651 | (cond 652 | ((<= (bzip2-state-zvec state) 653 | (aref (bzip2-state-g-limit state) 654 | (bzip2-state-zn state))) 655 | (transition-to bzip2-get-mtf-value-done)) 656 | (t 657 | (incf (bzip2-state-zn state)) 658 | (transition-to bzip2-get-mtf-value3)))) 659 | 660 | (bzip2-get-mtf-value3 (state) 661 | (declare (type bzip2-state state)) 662 | (let ((zj (ensure-and-read-bits 1 state))) 663 | (setf (bzip2-state-zvec state) 664 | (logior (ash (bzip2-state-zvec state) 1) zj)) 665 | (transition-to bzip2-get-mtf-value2))) 666 | 667 | (bzip2-get-mtf-value-done (state) 668 | (declare (type bzip2-state state)) 669 | (let* ((g-base (bzip2-state-g-base state)) 670 | (zn (bzip2-state-zn state)) 671 | (zvec (bzip2-state-zvec state)) 672 | (index (- zvec (aref g-base zn)))) 673 | (when (or (< index 0) (>= index +bz-max-alpha-size+)) 674 | (error 'invalid-bzip2-data)) 675 | (setf (bzip2-state-lval state) 676 | (aref (bzip2-state-g-perm state) index)) 677 | (let ((f (bzip2-state-mtf-continuation state))) 678 | (declare (type function f)) 679 | (setf (bzip2-state-state state) f) 680 | (funcall f state)))) 681 | 682 | (bzip2-enter-mtf-decode-loop (state) 683 | (declare (type bzip2-state state)) 684 | (let ((next-sym (bzip2-state-lval state))) 685 | (cond 686 | ((= next-sym (bzip2-state-EOB state)) 687 | (transition-to bzip2-prepare-cftab)) 688 | ((or (= next-sym +bz-runa+) (= next-sym +bz-runb+)) 689 | (setf (bzip2-state-es state) -1 690 | (bzip2-state-N state) 1) 691 | (transition-to bzip2-decode-rle-sequence)) 692 | (t 693 | (transition-to bzip2-runc))))) 694 | 695 | (bzip2-decode-rle-sequence (state) 696 | (declare (type bzip2-state state)) 697 | (let ((next-sym (bzip2-state-lval state))) 698 | (cond 699 | ((= next-sym +bz-runa+) 700 | (incf (bzip2-state-es state) (bzip2-state-N state))) 701 | ((= next-sym +bz-runb+) 702 | (incf (bzip2-state-es state) (* (bzip2-state-N state) 2)))) 703 | (setf (bzip2-state-N state) (* (bzip2-state-N state) 2)) 704 | (setf (bzip2-state-mtf-continuation state) #'bzip2-maybe-finish-rle-sequence) 705 | (transition-to bzip2-get-mtf-value))) 706 | 707 | (bzip2-maybe-finish-rle-sequence (state) 708 | (declare (type bzip2-state state)) 709 | (let ((next-sym (bzip2-state-lval state))) 710 | (if (or (= next-sym +bz-runa+) (= next-sym +bz-runb+)) 711 | (transition-to bzip2-decode-rle-sequence) 712 | (transition-to bzip2-finish-rle-sequence)))) 713 | 714 | (bzip2-finish-rle-sequence (state) 715 | (declare (type bzip2-state state)) 716 | (let ((uc (aref (bzip2-state-seq-to-unseq state) 717 | (aref (bzip2-state-mtfa state) 718 | (aref (bzip2-state-mtfbase state) 0))))) 719 | (incf (aref (bzip2-state-unzftab state) uc) 720 | (incf (bzip2-state-es state))) 721 | (if (bzip2-state-small-decompression-p state) 722 | (error 'bzip2-small-decompression-unimplemented) 723 | (loop with nblock = (bzip2-state-nblock state) 724 | with nblockMAX = (bzip2-state-nblockMAX state) 725 | with tt = (bzip2-state-tt state) 726 | repeat (bzip2-state-es state) 727 | do 728 | (when (>= nblock nblockMAX) 729 | (error 'invalid-bzip2-data)) 730 | (setf (aref tt nblock) uc) 731 | (incf nblock) 732 | finally 733 | (setf (bzip2-state-nblock state) nblock) 734 | ;; We're not 'returning' anything here, we're 735 | ;; just forcing this call to be in tail 736 | ;; position. 737 | (return (transition-to bzip2-enter-mtf-decode-loop)))))) 738 | 739 | (bzip2-runc (state) 740 | (declare (type bzip2-state state)) 741 | (let ((next-sym (bzip2-state-lval state)) 742 | (uc 0)) 743 | (when (>= (bzip2-state-nblock state) 744 | (bzip2-state-nblockMAX state)) 745 | (error 'invalid-bzip2-data)) 746 | (let ((mtfbase (bzip2-state-mtfbase state)) 747 | (mtfa (bzip2-state-mtfa state)) 748 | (nn (1- next-sym))) 749 | (cond 750 | ((< nn +mtfl-size+) 751 | ;; "avoid general-case expense" 752 | (let ((pp (aref mtfbase 0))) 753 | (setf uc (aref mtfa (+ pp nn))) 754 | (replace mtfa mtfa :start1 (1+ pp) :end1 (+ pp nn 1) 755 | :start2 pp :end2 (+ pp nn)) 756 | (setf (aref mtfa pp) uc))) 757 | (t 758 | ;; "general case" 759 | (let* ((lno (truncate nn +mtfl-size+)) 760 | (off (rem nn +mtfl-size+)) 761 | (pp (+ (aref mtfbase lno) off))) 762 | (setf uc (aref mtfa pp)) 763 | (loop while (> pp (aref mtfbase lno)) 764 | do (setf (aref mtfa pp) (aref mtfa (1- pp))) 765 | (decf pp)) 766 | (incf (aref mtfbase lno)) 767 | (loop for x from lno above 0 768 | do 769 | (setf (aref mtfa (decf (aref mtfbase x))) 770 | (aref mtfa (+ (aref mtfbase (1- x)) (1- +mtfl-size+))))) 771 | (setf (aref mtfa (decf (aref mtfbase 0))) uc) 772 | (when (zerop (aref mtfbase 0)) 773 | (loop with kk = (1- +mtfa-size+) 774 | for ii from (1- (floor 256 +mtfl-size+)) downto 0 775 | do (loop for jj from (1- +mtfl-size+) downto 0 776 | do (setf (aref mtfa kk) 777 | (aref mtfa (+ (aref mtfbase ii) jj))) 778 | (decf kk)) 779 | (setf (aref mtfbase ii) (1+ kk))))))) 780 | (incf (aref (bzip2-state-unzftab state) 781 | (aref (bzip2-state-seq-to-unseq state) uc))) 782 | (if (bzip2-state-small-decompression-p state) 783 | (error 'bzip2-small-decompression-unimplemented) 784 | (setf (aref (bzip2-state-tt state) (bzip2-state-nblock state)) 785 | (aref (bzip2-state-seq-to-unseq state) uc))) 786 | (incf (bzip2-state-nblock state)) 787 | (setf (bzip2-state-mtf-continuation state) #'bzip2-enter-mtf-decode-loop) 788 | (transition-to bzip2-get-mtf-value)))) 789 | 790 | (bzip2-prepare-cftab (state) 791 | (declare (type bzip2-state state)) 792 | (when (or (minusp (bzip2-state-original-pointer state)) 793 | (>= (bzip2-state-original-pointer state) 794 | (bzip2-state-nblock state))) 795 | (error 'invalid-bzip2-data)) 796 | (let ((cftab (bzip2-state-cftab state)) 797 | (unzftab (bzip2-state-unzftab state))) 798 | (setf (aref cftab 0) 0) 799 | (replace cftab unzftab :start1 1 :end1 257 :start2 0 :end2 256) 800 | (loop for i from 1 to 256 801 | do (incf (aref cftab i) (aref cftab (1- i)))) 802 | (loop with nblock = (bzip2-state-nblock state) 803 | for i from 0 to 256 804 | unless (<= 0 (aref cftab i) nblock) 805 | do (error 'invalid-bzip2-data)) 806 | (setf (bzip2-state-out-len state) 0 807 | (bzip2-state-out-ch state) 0 808 | (bzip2-state-calculated-block-crc state) #xffffffff) 809 | (loop with nblock = (bzip2-state-nblock state) 810 | with tt = (bzip2-state-tt state) 811 | for i from 0 below nblock 812 | do (let ((uc (logand (aref tt i) #xff))) 813 | (setf (aref tt (aref cftab uc)) 814 | (logior (aref tt (aref cftab uc)) (ash i 8))) 815 | (incf (aref cftab uc))) 816 | finally 817 | (setf (bzip2-state-t-position state) 818 | (ash (aref tt (bzip2-state-original-pointer state)) -8)) 819 | (setf (bzip2-state-n-blocks-used state) 0) 820 | (cond 821 | ((bzip2-state-block-randomized-p state) 822 | (error 'bzip2-randomized-blocks-unimplemented)) 823 | (t 824 | ;; BZIP2-STATE-T-POSITION was sometimes set to 825 | ;; a value outside its declared domain. Now 826 | ;; TEMP is used to store this value instead. 827 | (let ((temp (aref tt (bzip2-state-t-position state)))) 828 | (setf (bzip2-state-k0 state) (logand #xff temp) 829 | (bzip2-state-t-position state) (ash temp -8))) 830 | (incf (bzip2-state-n-blocks-used state)))) 831 | ;; We're not 'returning' anything here, we're just 832 | ;; forcing this call to be in tail position. 833 | (return (transition-to bzip2-output))))) 834 | 835 | (bzip2-output (state) 836 | (declare (type bzip2-state state)) 837 | (let ((corruptp (undo-rle-obuf-to-output state))) 838 | (when corruptp 839 | (error 'invalid-bzip2-data)) 840 | (unless (and (= (bzip2-state-n-blocks-used state) 841 | (1+ (bzip2-state-nblock state))) 842 | (zerop (bzip2-state-out-len state))) 843 | (throw 'bzip2-done :ok)) 844 | (let ((stored (bzip2-state-stored-block-crc state)) 845 | (calculated (bzip2-state-calculated-block-crc state))) 846 | (setf calculated (logand #xffffffff (lognot calculated))) 847 | (setf (bzip2-state-calculated-block-crc state) calculated) 848 | (unless (= calculated stored) 849 | (error 'checksum-mismatch 850 | :stored stored 851 | :computed calculated 852 | :kind :crc32)) 853 | (setf (bzip2-state-calculated-combined-crc state) 854 | (logand #xffffffff 855 | (logior (ash (bzip2-state-calculated-combined-crc state) 1) 856 | (ash (bzip2-state-calculated-combined-crc state) -31)))) 857 | (setf (bzip2-state-calculated-combined-crc state) 858 | (logand #xffffffff 859 | (logxor (bzip2-state-calculated-combined-crc state) 860 | calculated))) 861 | (transition-to bzip2-block-header1)))) 862 | 863 | (bzip2-end-header2 (state) 864 | (declare (type bzip2-state state)) 865 | (let ((byte (ensure-and-read-bits 8 state))) 866 | (if (= byte #x72) 867 | (transition-to bzip2-end-header3) 868 | (error 'invalid-bzip2-data)))) 869 | 870 | (bzip2-end-header3 (state) 871 | (declare (type bzip2-state state)) 872 | (let ((byte (ensure-and-read-bits 8 state))) 873 | (if (= byte #x45) 874 | (transition-to bzip2-end-header4) 875 | (error 'invalid-bzip2-data)))) 876 | 877 | (bzip2-end-header4 (state) 878 | (declare (type bzip2-state state)) 879 | (let ((byte (ensure-and-read-bits 8 state))) 880 | (if (= byte #x38) 881 | (transition-to bzip2-end-header5) 882 | (error 'invalid-bzip2-data)))) 883 | 884 | (bzip2-end-header5 (state) 885 | (declare (type bzip2-state state)) 886 | (let ((byte (ensure-and-read-bits 8 state))) 887 | (if (= byte #x50) 888 | (transition-to bzip2-end-header6) 889 | (error 'invalid-bzip2-data)))) 890 | 891 | (bzip2-end-header6 (state) 892 | (declare (type bzip2-state state)) 893 | (let ((byte (ensure-and-read-bits 8 state))) 894 | (unless (= byte #x90) 895 | (error 'invalid-bzip2-data)) 896 | (setf (bzip2-state-stored-combined-crc state) 0) 897 | (transition-to bzip2-stored-combined-crc32-1))) 898 | 899 | (bzip2-stored-combined-crc32-1 (state) 900 | (declare (type bzip2-state state)) 901 | (setf (bzip2-state-stored-combined-crc state) 902 | (ensure-and-read-bits 8 state)) 903 | (transition-to bzip2-stored-combined-crc32-2)) 904 | 905 | (bzip2-stored-combined-crc32-2 (state) 906 | (declare (type bzip2-state state)) 907 | (let ((byte (ensure-and-read-bits 8 state))) 908 | (setf (bzip2-state-stored-combined-crc state) 909 | (logand #xffffffff 910 | (logior (ash (bzip2-state-stored-combined-crc state) 8) 911 | byte))) 912 | (transition-to bzip2-stored-combined-crc32-3))) 913 | 914 | (bzip2-stored-combined-crc32-3 (state) 915 | (declare (type bzip2-state state)) 916 | (let ((byte (ensure-and-read-bits 8 state))) 917 | (setf (bzip2-state-stored-combined-crc state) 918 | (logand #xffffffff 919 | (logior (ash (bzip2-state-stored-combined-crc state) 8) 920 | byte))) 921 | (transition-to bzip2-stored-combined-crc32-4))) 922 | 923 | (bzip2-stored-combined-crc32-4 (state) 924 | (declare (type bzip2-state state)) 925 | (let ((byte (ensure-and-read-bits 8 state))) 926 | (setf (bzip2-state-stored-combined-crc state) 927 | (logand #xffffffff 928 | (logior (ash (bzip2-state-stored-combined-crc state) 8) 929 | byte))) 930 | (unless (= (bzip2-state-stored-combined-crc state) 931 | (bzip2-state-calculated-combined-crc state)) 932 | (error 'checksum-mismatch 933 | :stored (bzip2-state-stored-combined-crc state) 934 | :computed (bzip2-state-calculated-combined-crc state) 935 | :kind :crc32)) 936 | (setf (bzip2-state-done state) t) 937 | (transition-to bzip2-done))) 938 | 939 | (bzip2-done (state) 940 | (declare (ignore state)) 941 | (throw 'bzip2-done t)) 942 | ) 943 | (unless (bzip2-state-state state) 944 | (setf (bzip2-state-state state) #'bzip2-header)) 945 | (funcall (the function (bzip2-state-state state)) state)))) 946 | 947 | (defun %bzip2-decompress (state input output &key (input-start 0) input-end 948 | (output-start 0) output-end) 949 | (declare (type bzip2-state state)) 950 | (let* ((input-end (or input-end (length input))) 951 | (output-end (or output-end (length output)))) 952 | (setf (bzip2-state-input state) input 953 | (bzip2-state-input-start state) input-start 954 | (bzip2-state-input-index state) input-start 955 | (bzip2-state-input-end state) input-end 956 | (bzip2-state-output state) output 957 | (bzip2-state-output-start state) output-start 958 | (bzip2-state-output-index state) output-start 959 | (bzip2-state-output-end state) output-end) 960 | (catch 'bzip2-done 961 | (%bzip2-state-machine state)) 962 | (values (- (bzip2-state-input-index state) input-start) 963 | (- (bzip2-state-output-index state) output-start)))) 964 | 965 | (defun make-bzip2-state () 966 | (let ((state (%make-bzip2-state))) 967 | (setf (dstate-checksum state) (make-crc32) 968 | (dstate-update-checksum state) #'update-crc32) 969 | state)) 970 | -------------------------------------------------------------------------------- /chipz.asd: -------------------------------------------------------------------------------- 1 | ; -*- mode: lisp -*- 2 | (cl:defpackage :chipz-system 3 | (:use :cl :asdf) 4 | (:export #:gray-streams)) 5 | (cl:in-package :chipz-system) 6 | 7 | (defclass txt-file (doc-file) ((type :initform "txt"))) 8 | (defclass css-file (doc-file) ((type :initform "css"))) 9 | 10 | (eval-when (:compile-toplevel :load-toplevel :execute) 11 | #+(or sbcl lispworks openmcl cmu allegro clisp ecl) 12 | (pushnew 'chipz-system:gray-streams cl:*features*)) 13 | 14 | (asdf:defsystem :chipz 15 | :version "0.8" 16 | :author "Nathan Froyd " 17 | :maintainer "Nathan Froyd " 18 | :description "A library for decompressing deflate, zlib, and gzip data" 19 | :license "BSD style" 20 | :components ((:static-file "NEWS") 21 | (:static-file "LICENSE") 22 | (:static-file "TODO") 23 | (:file "package") 24 | (:module "doc" 25 | :components 26 | ((:html-file "index") 27 | (:txt-file "chipz-doc") 28 | (:css-file "style"))) 29 | (:file "constants" :depends-on ("package")) 30 | (:file "types-and-tables" :depends-on ("constants")) 31 | (:file "crc32" :depends-on ("types-and-tables")) 32 | (:file "adler32" :depends-on ("types-and-tables")) 33 | (:file "conditions" :depends-on ("package")) 34 | (:file "dstate" :depends-on ("package")) 35 | (:file "inflate-state" :depends-on ("dstate" "crc32" "adler32")) 36 | (:file "gzip" :depends-on ("inflate-state" "conditions")) 37 | (:file "zlib" :depends-on ("inflate-state" "conditions")) 38 | (:file "inflate" :depends-on ("inflate-state" 39 | "gzip" "zlib" 40 | "conditions")) 41 | (:file "bzip2" :depends-on ("dstate" "constants")) 42 | (:file "decompress" :depends-on ("inflate-state" 43 | "inflate" "bzip2")) 44 | #+chipz-system:gray-streams 45 | (:file "stream" :depends-on ("inflate-state" "inflate")) 46 | 47 | #-chipz-system:gray-streams 48 | (:file "stream-fallback" :depends-on ("package")))) 49 | -------------------------------------------------------------------------------- /conditions.lisp: -------------------------------------------------------------------------------- 1 | ;;;; conditions.lisp -- errors that can be thrown by chipz 2 | 3 | (in-package :chipz) 4 | 5 | (define-condition chipz-error (simple-error) 6 | () 7 | (:documentation "The base condition of the CHIPZ library. All 8 | other conditions inherit from this error.")) 9 | 10 | (define-condition decompression-error (chipz-error) 11 | () 12 | (:documentation "The base condition of all conditions signaled during 13 | decompression.")) 14 | 15 | (define-condition invalid-format-error (chipz-error) 16 | ((format :initarg :format :reader invalid-format)) 17 | (:report (lambda (condition stream) 18 | (format stream "Invalid format ~S" 19 | (invalid-format condition)))) 20 | (:documentation "Signaled when an invalid format name is passed to 21 | MAKE-DSTATE, MAKE-INFLATE-STATE, or DECOMPRESS.")) 22 | 23 | (define-condition invalid-checksum-error (decompression-error) 24 | ((expected-checksum :initarg :stored :reader expected-checksum) 25 | (actual-checksum :initarg :computed :reader actual-checksum) 26 | (kind :initarg :kind :reader checksum-kind)) 27 | (:report (lambda (condition stream) 28 | (format stream "Invalid ~A checksum, expected ~X, got ~X" 29 | (checksum-kind condition) 30 | (expected-checksum condition) 31 | (actual-checksum condition)))) 32 | (:documentation "Signaled when the checksum of decompressed data does 33 | not match the expected value.")) 34 | 35 | (define-condition premature-end-of-stream (decompression-error) 36 | () 37 | (:report (lambda (condition stream) 38 | (declare (ignore condition)) 39 | (format stream "Unexpected EOF"))) 40 | (:documentation "Signaled when FINISH-DSTATE is called on a state that 41 | has not actually reached the end of the input being decompressed.")) 42 | 43 | 44 | ;;; Errors related to inflate 45 | 46 | (define-condition inflate-error (decompression-error) 47 | () 48 | (:documentation "The base condition of conditions signaled when 49 | decompressing DEFLATE-related formats.")) 50 | 51 | (define-condition invalid-zlib-header-error (inflate-error) 52 | () 53 | (:report (lambda (condition stream) 54 | (declare (ignore condition)) 55 | (format stream "Invalid zlib header"))) 56 | (:documentation "Signaled when a zlib header does not pass the 57 | consistency check.")) 58 | 59 | (define-condition invalid-gzip-header-error (inflate-error) 60 | () 61 | (:report (lambda (condition stream) 62 | (declare (ignore condition)) 63 | (format stream "Invalid gzip header"))) 64 | (:documentation "Signaled when a gzip header does not have the proper ID.")) 65 | 66 | (define-condition reserved-block-type-error (inflate-error) 67 | () 68 | (:report (lambda (condition stream) 69 | (declare (ignore condition)) 70 | (format stream "Invalid deflate block"))) 71 | (:documentation "Signaled when an invalid deflate block is found.")) 72 | 73 | (define-condition invalid-stored-block-length-error (inflate-error) 74 | () 75 | (:report (lambda (condition stream) 76 | (declare (ignore condition)) 77 | (format stream "Invalid stored block length"))) 78 | (:documentation "Signaled when a stored block's length does not pass 79 | the consistency check.")) 80 | 81 | 82 | ;;; Errors related to bzip2 83 | 84 | (define-condition bzip2-error (decompression-error) 85 | () 86 | (:documentation "The base condition of conditions signaled when 87 | decompressing BZIP2-related formats.")) 88 | 89 | (define-condition invalid-bzip2-data (bzip2-error) 90 | () 91 | (:documentation "Signaled when invalid bzip2 data is found.")) 92 | -------------------------------------------------------------------------------- /constants.lisp: -------------------------------------------------------------------------------- 1 | (in-package :chipz) 2 | 3 | (defmacro define-constant (name value) 4 | `(unless (boundp ',name) 5 | (defconstant ,name ,value))) 6 | 7 | 8 | ;;;; DEFLATE constants. 9 | 10 | ;;; block types 11 | (define-constant +block-no-compress+ 0) 12 | (define-constant +block-fixed-codes+ 1) 13 | (define-constant +block-dynamic-codes+ 2) 14 | (define-constant +block-invalid+ 3) 15 | 16 | (define-constant +max-code-length+ 16) 17 | (define-constant +max-codes+ 288) 18 | (define-constant +max-n-code-lengths+ 19) 19 | (define-constant +deflate-max-bits+ 15) 20 | 21 | (define-constant +length-code-extra-bits+ 22 | (coerce #(0 0 0 0 0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 5 5 5 5 0) 23 | '(vector (unsigned-byte 16)))) 24 | 25 | (define-constant +length-code-base-lengths+ 26 | (coerce #(3 4 5 6 7 8 9 10 11 13 15 17 19 23 27 27 | 31 35 43 51 59 67 83 99 115 131 163 195 227 258) 28 | '(vector (unsigned-byte 16)))) 29 | 30 | 31 | ;;;; BZIP constants. 32 | 33 | (defconstant +bz-header-b+ #x42) 34 | (defconstant +bz-header-z+ #x5a) 35 | (defconstant +bz-header-h+ #x68) 36 | (defconstant +bz-header-0+ #x30) 37 | (defconstant +100k+ (expt 10 5)) 38 | 39 | (defconstant +mtfa-size+ 4096) 40 | (defconstant +mtfl-size+ 16) 41 | (defconstant +bz-max-alpha-size+ 258) 42 | (defconstant +bz-max-code-len+ 23) 43 | (defconstant +bz-runa+ 0) 44 | (defconstant +bz-runb+ 1) 45 | (defconstant +bz-n-groups+ 6) 46 | (defconstant +bz-g-size+ 50) 47 | (defconstant +bz-n-iters+ 4) 48 | (defconstant +bz-max-selectors+ (+ 2 (/ (* 9 +100k+) +bz-g-size+))) 49 | 50 | 51 | ;;; miscellaneous 52 | 53 | ;;; for DECOMPRESS. 54 | (defconstant +default-buffer-size+ 8192) 55 | 56 | ;;; CRC32 57 | (declaim (type (simple-array (unsigned-byte 32) (256)) +crc32-table+ +bzip2-crc32-table+)) 58 | (define-constant +crc32-table+ 59 | (coerce '(#x00000000 #x77073096 #xEE0E612C #x990951BA #x076DC419 #x706AF48F 60 | #xE963A535 #x9E6495A3 #x0EDB8832 #x79DCB8A4 #xE0D5E91E #x97D2D988 61 | #x09B64C2B #x7EB17CBD #xE7B82D07 #x90BF1D91 #x1DB71064 #x6AB020F2 62 | #xF3B97148 #x84BE41DE #x1ADAD47D #x6DDDE4EB #xF4D4B551 #x83D385C7 63 | #x136C9856 #x646BA8C0 #xFD62F97A #x8A65C9EC #x14015C4F #x63066CD9 64 | #xFA0F3D63 #x8D080DF5 #x3B6E20C8 #x4C69105E #xD56041E4 #xA2677172 65 | #x3C03E4D1 #x4B04D447 #xD20D85FD #xA50AB56B #x35B5A8FA #x42B2986C 66 | #xDBBBC9D6 #xACBCF940 #x32D86CE3 #x45DF5C75 #xDCD60DCF #xABD13D59 67 | #x26D930AC #x51DE003A #xC8D75180 #xBFD06116 #x21B4F4B5 #x56B3C423 68 | #xCFBA9599 #xB8BDA50F #x2802B89E #x5F058808 #xC60CD9B2 #xB10BE924 69 | #x2F6F7C87 #x58684C11 #xC1611DAB #xB6662D3D #x76DC4190 #x01DB7106 70 | #x98D220BC #xEFD5102A #x71B18589 #x06B6B51F #x9FBFE4A5 #xE8B8D433 71 | #x7807C9A2 #x0F00F934 #x9609A88E #xE10E9818 #x7F6A0DBB #x086D3D2D 72 | #x91646C97 #xE6635C01 #x6B6B51F4 #x1C6C6162 #x856530D8 #xF262004E 73 | #x6C0695ED #x1B01A57B #x8208F4C1 #xF50FC457 #x65B0D9C6 #x12B7E950 74 | #x8BBEB8EA #xFCB9887C #x62DD1DDF #x15DA2D49 #x8CD37CF3 #xFBD44C65 75 | #x4DB26158 #x3AB551CE #xA3BC0074 #xD4BB30E2 #x4ADFA541 #x3DD895D7 76 | #xA4D1C46D #xD3D6F4FB #x4369E96A #x346ED9FC #xAD678846 #xDA60B8D0 77 | #x44042D73 #x33031DE5 #xAA0A4C5F #xDD0D7CC9 #x5005713C #x270241AA 78 | #xBE0B1010 #xC90C2086 #x5768B525 #x206F85B3 #xB966D409 #xCE61E49F 79 | #x5EDEF90E #x29D9C998 #xB0D09822 #xC7D7A8B4 #x59B33D17 #x2EB40D81 80 | #xB7BD5C3B #xC0BA6CAD #xEDB88320 #x9ABFB3B6 #x03B6E20C #x74B1D29A 81 | #xEAD54739 #x9DD277AF #x04DB2615 #x73DC1683 #xE3630B12 #x94643B84 82 | #x0D6D6A3E #x7A6A5AA8 #xE40ECF0B #x9309FF9D #x0A00AE27 #x7D079EB1 83 | #xF00F9344 #x8708A3D2 #x1E01F268 #x6906C2FE #xF762575D #x806567CB 84 | #x196C3671 #x6E6B06E7 #xFED41B76 #x89D32BE0 #x10DA7A5A #x67DD4ACC 85 | #xF9B9DF6F #x8EBEEFF9 #x17B7BE43 #x60B08ED5 #xD6D6A3E8 #xA1D1937E 86 | #x38D8C2C4 #x4FDFF252 #xD1BB67F1 #xA6BC5767 #x3FB506DD #x48B2364B 87 | #xD80D2BDA #xAF0A1B4C #x36034AF6 #x41047A60 #xDF60EFC3 #xA867DF55 88 | #x316E8EEF #x4669BE79 #xCB61B38C #xBC66831A #x256FD2A0 #x5268E236 89 | #xCC0C7795 #xBB0B4703 #x220216B9 #x5505262F #xC5BA3BBE #xB2BD0B28 90 | #x2BB45A92 #x5CB36A04 #xC2D7FFA7 #xB5D0CF31 #x2CD99E8B #x5BDEAE1D 91 | #x9B64C2B0 #xEC63F226 #x756AA39C #x026D930A #x9C0906A9 #xEB0E363F 92 | #x72076785 #x05005713 #x95BF4A82 #xE2B87A14 #x7BB12BAE #x0CB61B38 93 | #x92D28E9B #xE5D5BE0D #x7CDCEFB7 #x0BDBDF21 #x86D3D2D4 #xF1D4E242 94 | #x68DDB3F8 #x1FDA836E #x81BE16CD #xF6B9265B #x6FB077E1 #x18B74777 95 | #x88085AE6 #xFF0F6A70 #x66063BCA #x11010B5C #x8F659EFF #xF862AE69 96 | #x616BFFD3 #x166CCF45 #xA00AE278 #xD70DD2EE #x4E048354 #x3903B3C2 97 | #xA7672661 #xD06016F7 #x4969474D #x3E6E77DB #xAED16A4A #xD9D65ADC 98 | #x40DF0B66 #x37D83BF0 #xA9BCAE53 #xDEBB9EC5 #x47B2CF7F #x30B5FFE9 99 | #xBDBDF21C #xCABAC28A #x53B39330 #x24B4A3A6 #xBAD03605 #xCDD70693 100 | #x54DE5729 #x23D967BF #xB3667A2E #xC4614AB8 #x5D681B02 #x2A6F2B94 101 | #xB40BBE37 #xC30C8EA1 #x5A05DF1B #x2D02EF8D) 102 | '(vector (unsigned-byte 32)))) 103 | 104 | (define-constant +bzip2-crc32-table+ 105 | (coerce '(#x00000000 #x04c11db7 #x09823b6e #x0d4326d9 106 | #x130476dc #x17c56b6b #x1a864db2 #x1e475005 107 | #x2608edb8 #x22c9f00f #x2f8ad6d6 #x2b4bcb61 108 | #x350c9b64 #x31cd86d3 #x3c8ea00a #x384fbdbd 109 | #x4c11db70 #x48d0c6c7 #x4593e01e #x4152fda9 110 | #x5f15adac #x5bd4b01b #x569796c2 #x52568b75 111 | #x6a1936c8 #x6ed82b7f #x639b0da6 #x675a1011 112 | #x791d4014 #x7ddc5da3 #x709f7b7a #x745e66cd 113 | #x9823b6e0 #x9ce2ab57 #x91a18d8e #x95609039 114 | #x8b27c03c #x8fe6dd8b #x82a5fb52 #x8664e6e5 115 | #xbe2b5b58 #xbaea46ef #xb7a96036 #xb3687d81 116 | #xad2f2d84 #xa9ee3033 #xa4ad16ea #xa06c0b5d 117 | #xd4326d90 #xd0f37027 #xddb056fe #xd9714b49 118 | #xc7361b4c #xc3f706fb #xceb42022 #xca753d95 119 | #xf23a8028 #xf6fb9d9f #xfbb8bb46 #xff79a6f1 120 | #xe13ef6f4 #xe5ffeb43 #xe8bccd9a #xec7dd02d 121 | #x34867077 #x30476dc0 #x3d044b19 #x39c556ae 122 | #x278206ab #x23431b1c #x2e003dc5 #x2ac12072 123 | #x128e9dcf #x164f8078 #x1b0ca6a1 #x1fcdbb16 124 | #x018aeb13 #x054bf6a4 #x0808d07d #x0cc9cdca 125 | #x7897ab07 #x7c56b6b0 #x71159069 #x75d48dde 126 | #x6b93dddb #x6f52c06c #x6211e6b5 #x66d0fb02 127 | #x5e9f46bf #x5a5e5b08 #x571d7dd1 #x53dc6066 128 | #x4d9b3063 #x495a2dd4 #x44190b0d #x40d816ba 129 | #xaca5c697 #xa864db20 #xa527fdf9 #xa1e6e04e 130 | #xbfa1b04b #xbb60adfc #xb6238b25 #xb2e29692 131 | #x8aad2b2f #x8e6c3698 #x832f1041 #x87ee0df6 132 | #x99a95df3 #x9d684044 #x902b669d #x94ea7b2a 133 | #xe0b41de7 #xe4750050 #xe9362689 #xedf73b3e 134 | #xf3b06b3b #xf771768c #xfa325055 #xfef34de2 135 | #xc6bcf05f #xc27dede8 #xcf3ecb31 #xcbffd686 136 | #xd5b88683 #xd1799b34 #xdc3abded #xd8fba05a 137 | #x690ce0ee #x6dcdfd59 #x608edb80 #x644fc637 138 | #x7a089632 #x7ec98b85 #x738aad5c #x774bb0eb 139 | #x4f040d56 #x4bc510e1 #x46863638 #x42472b8f 140 | #x5c007b8a #x58c1663d #x558240e4 #x51435d53 141 | #x251d3b9e #x21dc2629 #x2c9f00f0 #x285e1d47 142 | #x36194d42 #x32d850f5 #x3f9b762c #x3b5a6b9b 143 | #x0315d626 #x07d4cb91 #x0a97ed48 #x0e56f0ff 144 | #x1011a0fa #x14d0bd4d #x19939b94 #x1d528623 145 | #xf12f560e #xf5ee4bb9 #xf8ad6d60 #xfc6c70d7 146 | #xe22b20d2 #xe6ea3d65 #xeba91bbc #xef68060b 147 | #xd727bbb6 #xd3e6a601 #xdea580d8 #xda649d6f 148 | #xc423cd6a #xc0e2d0dd #xcda1f604 #xc960ebb3 149 | #xbd3e8d7e #xb9ff90c9 #xb4bcb610 #xb07daba7 150 | #xae3afba2 #xaafbe615 #xa7b8c0cc #xa379dd7b 151 | #x9b3660c6 #x9ff77d71 #x92b45ba8 #x9675461f 152 | #x8832161a #x8cf30bad #x81b02d74 #x857130c3 153 | #x5d8a9099 #x594b8d2e #x5408abf7 #x50c9b640 154 | #x4e8ee645 #x4a4ffbf2 #x470cdd2b #x43cdc09c 155 | #x7b827d21 #x7f436096 #x7200464f #x76c15bf8 156 | #x68860bfd #x6c47164a #x61043093 #x65c52d24 157 | #x119b4be9 #x155a565e #x18197087 #x1cd86d30 158 | #x029f3d35 #x065e2082 #x0b1d065b #x0fdc1bec 159 | #x3793a651 #x3352bbe6 #x3e119d3f #x3ad08088 160 | #x2497d08d #x2056cd3a #x2d15ebe3 #x29d4f654 161 | #xc5a92679 #xc1683bce #xcc2b1d17 #xc8ea00a0 162 | #xd6ad50a5 #xd26c4d12 #xdf2f6bcb #xdbee767c 163 | #xe3a1cbc1 #xe760d676 #xea23f0af #xeee2ed18 164 | #xf0a5bd1d #xf464a0aa #xf9278673 #xfde69bc4 165 | #x89b8fd09 #x8d79e0be #x803ac667 #x84fbdbd0 166 | #x9abc8bd5 #x9e7d9662 #x933eb0bb #x97ffad0c 167 | #xafb010b1 #xab710d06 #xa6322bdf #xa2f33668 168 | #xbcb4666d #xb8757bda #xb5365d03 #xb1f740b4) 169 | '(vector (unsigned-byte 32)))) 170 | 171 | ;;; Adler32, smallest prime < 65536 172 | (defconstant adler32-modulo 65521) 173 | -------------------------------------------------------------------------------- /crc32.lisp: -------------------------------------------------------------------------------- 1 | ;;;; crc32.lisp -- implementation of the CRC32 checksum 2 | 3 | (in-package :chipz) 4 | 5 | #+sbcl 6 | (progn 7 | (defstruct (crc32 8 | (:copier copy-crc32)) 9 | (crc #xffffffff :type (unsigned-byte 32))) 10 | 11 | (defun update-crc32 (state vector start end) 12 | (declare (type simple-octet-vector vector)) 13 | (declare (type index start end)) 14 | (do ((crc (crc32-crc state)) 15 | (i start (1+ i)) 16 | (table +crc32-table+)) 17 | ((>= i end) 18 | (setf (crc32-crc state) crc) 19 | state) 20 | (declare (type (unsigned-byte 32) crc)) 21 | (setf crc (logxor (aref table 22 | (logand (logxor crc (aref vector i)) #xff)) 23 | (ash crc -8))))) 24 | 25 | (defun produce-crc32 (state) 26 | (logxor #xffffffff (crc32-crc state))) 27 | ) 28 | 29 | ;; An implementation that conses significantly less on most 30 | ;; implementations. Credit to Zach Beane. 31 | #-sbcl 32 | (progn 33 | (defstruct (crc32 34 | (:copier copy-crc32)) 35 | (low #xffff) 36 | (high #xffff)) 37 | 38 | (defun crc32-table () 39 | (let ((table (make-array 512 :element-type '(unsigned-byte 16)))) 40 | (dotimes (n 256 table) 41 | (let ((c n)) 42 | (declare (type (unsigned-byte 32) c)) 43 | (dotimes (k 8) 44 | (if (logbitp 0 c) 45 | (setf c (logxor #xEDB88320 (ash c -1))) 46 | (setf c (ash c -1))) 47 | (setf (aref table (ash n 1)) (ldb (byte 16 16) c) 48 | (aref table (1+ (ash n 1))) (ldb (byte 16 0) c))))))) 49 | 50 | (defvar *crc32-table* (crc32-table)) 51 | 52 | (defun crc32 (high low buf start count) 53 | (declare (type (unsigned-byte 16) high low) 54 | (type index start count) 55 | (type simple-octet-vector buf) 56 | (optimize speed)) 57 | (let ((i start) 58 | (table *crc32-table*)) 59 | (declare (type index i) 60 | (type (simple-array (unsigned-byte 16) (*)) table)) 61 | (dotimes (j count (values high low)) 62 | (let ((index (logxor (logand low #xFF) (aref buf i)))) 63 | (declare (type (integer 0 255) index)) 64 | (let ((high-index (ash index 1)) 65 | (low-index (1+ (ash index 1)))) 66 | (declare (type (integer 0 511) high-index low-index)) 67 | (let ((t-high (aref table high-index)) 68 | (t-low (aref table low-index))) 69 | (declare (type (unsigned-byte 16) t-high t-low)) 70 | (incf i) 71 | (setf low (logxor (ash (logand high #xFF) 8) 72 | (ash low -8) 73 | t-low)) 74 | (setf high (logxor (ash high -8) t-high)))))))) 75 | 76 | (defun update-crc32 (state vector start end) 77 | ;; ABCL used to miscompile (SETF (VALUES (ACCESSOR ...) ...) ...) 78 | ;; in case you were wondering why we take this route. 79 | (multiple-value-bind (high low) (crc32 (crc32-high state) (crc32-low state) 80 | vector start (- end start)) 81 | (setf (crc32-high state) high 82 | (crc32-low state) low) 83 | (values high low))) 84 | 85 | (defun produce-crc32 (state) 86 | (+ (ash (logxor (crc32-high state) #xFFFF) 16) 87 | (logxor (crc32-low state) #xFFFF))) 88 | ) 89 | -------------------------------------------------------------------------------- /decompress.lisp: -------------------------------------------------------------------------------- 1 | (in-package :chipz) 2 | 3 | ;;; We provide several convenience functions for decompression: 4 | ;;; 5 | ;;; * decompress a buffer to a newly-consed buffer; 6 | ;;; * decompress a stream to a newly-consed buffer; 7 | ;;; * decompress a pathname to a newly-consed buffer; 8 | ;;; * decompress a buffer to a user-specified buffer; 9 | ;;; * decompress a buffer to a stream; 10 | ;;; * decompress a stream to a stream. 11 | ;;; * decompress a pathname to another pathname; 12 | ;;; * decompress a pathname to a stream; 13 | ;;; 14 | ;;; We do not provide stream->buffer decompression, as we have no way of 15 | ;;; knowing how much to read from the stream to fill the buffer, no way 16 | ;;; of determining what to do with possible state left in the 17 | ;;; INFLATE-STATE that we used, etc. Application-specific logic will 18 | ;;; have to handle those bits. 19 | 20 | (defgeneric decompress (output state input &key &allow-other-keys) 21 | (:method (output format input &rest keys) 22 | (%decompress output format input keys)) 23 | ;; Accommodate people who want to use lists as input, possibly for 24 | ;; experimenting with the API. 25 | (:method (output format (input list) &rest keys) 26 | (let ((vector (coerce input '(simple-array (unsigned-byte 8) (*))))) 27 | (%decompress output format vector keys)))) 28 | 29 | (defun %decompress (output format input keys) 30 | (let ((state (make-dstate format))) 31 | (multiple-value-prog1 (apply #'decompress output state input keys) 32 | (finish-dstate state)))) 33 | 34 | ;;; SUBSEQ is specified to always make a copy. But we don't want an 35 | ;;; exact copy of a freshly-consed vector; that'd be wasteful. 36 | (defun maybe-subseq (v end) 37 | (if (= end (length v)) 38 | v 39 | (subseq v 0 end))) 40 | 41 | (defun decompress-fun-for-state (state) 42 | (typecase state 43 | (inflate-state #'%inflate) 44 | (bzip2-state #'%bzip2-decompress))) 45 | 46 | ;; For convenience. 47 | (defun %decompress-from-pathname (output state pathname buffer-size) 48 | (with-open-file (stream pathname :element-type '(unsigned-byte 8) 49 | :direction :input) 50 | (decompress output state stream 51 | :buffer-size (if (eq buffer-size :file-length) 52 | (file-length stream) 53 | buffer-size)))) 54 | 55 | (defmethod decompress ((output null) (state decompression-state) (input pathname) 56 | &key) 57 | (%decompress-from-pathname output state input :file-length)) 58 | 59 | (defmethod decompress ((output pathname) (state decompression-state) (input pathname) 60 | &key buffer-size) 61 | (check-type buffer-size (or null integer)) 62 | (with-open-file (stream output :element-type '(unsigned-byte 8) 63 | :direction :output) 64 | (%decompress-from-pathname stream state input buffer-size))) 65 | 66 | (defmethod decompress ((output stream) (state decompression-state) (input pathname) 67 | &key buffer-size) 68 | (check-type buffer-size (or null integer)) 69 | (%decompress-from-pathname output state input buffer-size)) 70 | 71 | (defun %decompress/null-vector (state input fun 72 | input-start input-end buffer-size) 73 | (declare (type function fun)) 74 | (loop 75 | with output = (make-array buffer-size :element-type '(unsigned-byte 8)) 76 | with output-start = 0 77 | do (cond 78 | ((= output-start (length output)) 79 | ;; Reallocate the output buffer. 80 | (let ((new (make-array (* 2 (length output)) 81 | :element-type '(unsigned-byte 8)))) 82 | (setf output (replace new output)))) 83 | (t 84 | (multiple-value-bind (consumed produced) 85 | (funcall fun state input output 86 | :input-start input-start :input-end input-end 87 | :output-start output-start :output-end (length output)) 88 | (incf input-start consumed) 89 | (incf output-start produced) 90 | (when (or (dstate-done state) 91 | (and (or (>= input-start input-end) 92 | (zerop consumed)) 93 | (zerop produced))) 94 | (return-from %decompress/null-vector (maybe-subseq output output-start)))))))) 95 | 96 | (defmethod decompress ((output null) (state decompression-state) (input vector) 97 | &key (input-start 0) input-end buffer-size) 98 | (%decompress/null-vector state input 99 | (decompress-fun-for-state state) 100 | input-start (or input-end (length input)) 101 | (or buffer-size +default-buffer-size+))) 102 | 103 | (defun %decompress/null-stream (state input fun buffer-size) 104 | (declare (type function fun)) 105 | (let ((input-buffer (make-array 8192 :element-type '(unsigned-byte 8)))) 106 | (declare (dynamic-extent input-buffer)) 107 | (loop 108 | with input-start = 0 109 | with input-end = 0 110 | with output = (make-array buffer-size :element-type '(unsigned-byte 8)) 111 | with output-start = 0 112 | initially (setf input-end (read-sequence input-buffer input)) 113 | do (cond 114 | ((= output-start (length output)) 115 | ;; Reallocate the output buffer. 116 | (let ((new (make-array (* 2 (length output)) 117 | :element-type '(unsigned-byte 8)))) 118 | (setf output (replace new output)))) 119 | (t 120 | (multiple-value-bind (consumed produced) 121 | (funcall fun state input-buffer output 122 | :input-start input-start :input-end input-end 123 | :output-start output-start) 124 | (incf input-start consumed) 125 | (incf output-start produced) 126 | (let ((input-consumed-p (>= input-start input-end))) 127 | ;; Get more input if possible. 128 | (when input-consumed-p 129 | (setf input-start 0 130 | input-end (read-sequence input-buffer input))) 131 | (when (or (dstate-done state) 132 | (and (or (and input-consumed-p (zerop input-end)) 133 | (zerop consumed)) 134 | (zerop produced))) 135 | (return-from %decompress/null-stream (maybe-subseq output output-start)))))))))) 136 | 137 | (defmethod decompress ((output null) (state decompression-state) (input stream) 138 | &key buffer-size) 139 | (%decompress/null-stream state input 140 | (decompress-fun-for-state state) 141 | (or buffer-size +default-buffer-size+))) 142 | 143 | (defun %decompress/vector-vector (output state input fun 144 | input-start input-end 145 | output-start output-end) 146 | (declare (type simple-octet-vector input output)) 147 | (declare (type function fun)) 148 | (loop 149 | with n-bytes-consumed = 0 and n-bytes-produced = 0 150 | do (multiple-value-bind (consumed produced) 151 | (funcall fun state input output 152 | :input-start input-start :input-end input-end 153 | :output-start output-start :output-end output-end) 154 | (incf input-start consumed) 155 | (incf output-start produced) 156 | (incf n-bytes-consumed consumed) 157 | (incf n-bytes-produced produced) 158 | (when (and (or (>= input-start input-end) 159 | (zerop consumed)) 160 | (or (>= output-start output-end) 161 | (zerop produced))) 162 | (return-from %decompress/vector-vector 163 | (values n-bytes-consumed n-bytes-produced)))))) 164 | 165 | (defmethod decompress ((output vector) (state decompression-state) (input vector) 166 | &key (input-start 0) input-end 167 | (output-start 0) output-end) 168 | (%decompress/vector-vector output state input 169 | (decompress-fun-for-state state) 170 | input-start (or input-end (length input)) 171 | output-start (or output-end (length output)))) 172 | 173 | (defun %decompress/stream-vector (output state input fun input-start input-end) 174 | (declare (type function fun)) 175 | (let ((buffer (make-array 8192 :element-type '(unsigned-byte 8)))) 176 | (declare (dynamic-extent buffer)) 177 | (loop (multiple-value-bind (consumed produced) 178 | (funcall fun state input buffer 179 | :input-start input-start :input-end input-end) 180 | (incf input-start consumed) 181 | (write-sequence buffer output :end produced) 182 | (when (or (dstate-done state) 183 | (and (or (>= input-start input-end) 184 | (zerop consumed)) 185 | (zerop produced))) 186 | (return-from %decompress/stream-vector output)))))) 187 | 188 | (defmethod decompress ((output stream) (state decompression-state) (input vector) 189 | &key (input-start 0) input-end) 190 | (%decompress/stream-vector output state input 191 | (decompress-fun-for-state state) 192 | input-start (or input-end (length input)))) 193 | 194 | (defun %decompress/stream-stream (output state input fun) 195 | (declare (type function fun)) 196 | (let ((input-buffer (make-array 8192 :element-type '(unsigned-byte 8))) 197 | (output-buffer (make-array 8192 :element-type '(unsigned-byte 8)))) 198 | (declare (dynamic-extent input-buffer output-buffer)) 199 | (loop 200 | with input-start = 0 201 | with input-end = 0 202 | initially (setf input-end (read-sequence input-buffer input)) 203 | do (multiple-value-bind (consumed produced) 204 | (funcall fun state input-buffer output-buffer 205 | :input-start input-start :input-end input-end) 206 | (incf input-start consumed) 207 | (write-sequence output-buffer output :end produced) 208 | (let ((input-consumed-p (>= input-start input-end))) 209 | (when input-consumed-p 210 | (setf input-start 0 211 | input-end (read-sequence input-buffer input))) 212 | (when (or (dstate-done state) 213 | (and (or (and input-consumed-p (zerop input-end)) 214 | (zerop consumed)) 215 | (zerop produced))) 216 | (return-from %decompress/stream-stream output))))))) 217 | 218 | (defmethod decompress ((output stream) (state decompression-state) (input stream) 219 | &key) 220 | (%decompress/stream-stream output state input 221 | (decompress-fun-for-state state))) 222 | -------------------------------------------------------------------------------- /doc/chipz-doc.txt: -------------------------------------------------------------------------------- 1 | (:author "Nathan Froyd" 2 | :email "froydnj@gmail.com" 3 | :package "Chipz" 4 | :cl-package "CHIPZ" 5 | :version #.(asdf:component-version (asdf:find-system :chipz)) 6 | :homepage "http://www.method-combination.net/lisp/chipz/" 7 | :download "http://www.method-combination.net/lisp/files/chipz.tar.gz") 8 | 9 | (:h1 ${package}) 10 | 11 | (:p ${package} " is a library for decompressing DEFLATE and BZIP2 data. 12 | DEFLATE data, defined in " (:url "http://www.ietf.org/rfc/rfc1951.txt" 13 | "RFC1951") ", forms the core of popular compression formats such as 14 | zlib (" (:url "http://www.ietf.org/rfc/rfc1950.txt" "RFC 1950") ") and 15 | gzip (" (:url "http://www.ietf.org/rfc/rfc1952.txt" "RFC 1952") "). As 16 | such, " ${package} " also provides for decompressing data in those 17 | formats as well. BZIP2 is the format used by the popular compression 18 | tool " (:url "http://www.bzip.org/" "bzip2") ".") 19 | 20 | (:p ${package} " is the reading complement to " (:url 21 | "http://www.xach.com/salza2/" "Salza") ".") 22 | 23 | (:h2 "Installation") 24 | 25 | (:p ${package} " can be downloaded at " (:url ${download} ${download}) 26 | ". The latest version is " ${version} ".") 27 | 28 | (:p "It comes with an ASDF system definition, so " `(ASDF:OOS 29 | 'ASDF:LOAD-OP :CHIPZ)` " should be all that you need to get started.") 30 | 31 | (:h2 "License") 32 | 33 | (:p ${package} " is released under a MIT-like license; you can do pretty 34 | much anything you want to with the code except claim that you wrote 35 | it.") 36 | 37 | (:h2 "Using the library") 38 | 39 | (:p "The main function of the library is " `decompress` ":") 40 | 41 | (:describe :generic-function (chipz:decompress output)) 42 | 43 | (:p "Five distinct use cases are covered by this single function:") 44 | 45 | (:ul 46 | (:li "Decompressing from an octet vector to a fresh octet vector;") 47 | (:li "Decompressing from a stream to a fresh octet vector;") 48 | (:li "Decompressing from an octet vector to a user-specified octet vector;") 49 | (:li "Decompressing from an octet vector to a stream;") 50 | (:li "Decompressing from a stream to a stream;")) 51 | 52 | (:note ${package} " does not provide for decompressing data from a stream 53 | to a user-specified buffer, as the buffer management involved cannot be 54 | done automatically by the library--the application must be involved in 55 | this case.") 56 | 57 | (:h3 ((:a name "one-shot")) "One-shot decompression") 58 | 59 | (:p "The first and second use cases above are intended to be convenient 60 | \"one-shot\" decompression methods. Therefore, although the description 61 | of the following methods attached to this generic function have an " 62 | `decompression-state` " parameter, as returned by " @make-dstate ", 63 | respectively, the usual way to use them will be to provide a " 64 | `format` " argument. This " `format` " argument should be one of:") 65 | 66 | (:ul 67 | (:li `chipz:bzip2` " for decompressing data in the bzip2 format;") 68 | (:li `chipz:gzip` " for decompressing data in the gzip format;") 69 | (:li `chipz:zlib` " for decompressing data in the zlib format;") 70 | (:li `chipz:deflate` " for decompressing data in the deflate format.")) 71 | 72 | (:p "The " `format` " argument can also be a keyword, such as " 73 | `:gzip` ", for backwards compatibility. Using symbols in the " `CHIPZ` 74 | " package is preferred, however.") 75 | 76 | (:p "Most applications will use " `chipz:gzip` " or " `chipz:bzip2` ", a 77 | few applications will use " `chipz:zlib` ", and uses of " 78 | `chipz:deflate` " will probably be few and far between.") 79 | 80 | (:p "All the method signatures described below also accept a " 81 | `format` " argument in lieu of an " `decompression-state` " argument.") 82 | 83 | (:p "The signatures of the first two methods are as follows.") 84 | 85 | (:describe :method (chipz:decompress (null chipz:decompression-state vector) output)) 86 | (:describe :method (chipz:decompress (null chipz:decompression-state stream) output)) 87 | 88 | (:p "A simple function to retrieve the contents of a gzip-compressed 89 | file, then, might be:") 90 | 91 | (:pre 92 | "(defun gzip-contents (pathname) 93 | (with-open-file (stream pathname :direction :input 94 | :element-type '(unsigned-byte 8)) 95 | (chipz:decompress nil 'chipz:gzip stream)))") 96 | 97 | (:p "These one-shot methods also support a " `:buffer-size` " argument 98 | as a hint of the size of decompressed data. The library uses this to 99 | pre-allocate the output buffer to the hinted size. Therefore, if you 100 | know the size of the decompressed data or have a good estimate, fewer 101 | allocations will be done, leading to slightly better performance. If " 102 | `:buffer-size` " is not provided or proves to be too small, the library 103 | will of course grow the output buffer as necessary.") 104 | 105 | (:h3 "Decompressing to a vector") 106 | 107 | (:p "An alternate way to deal with compressed data is to read in a 108 | buffer's worth of data, decompress the buffer, and then deal with any 109 | remaining input and the produced output, looping to read and process 110 | more data as appropriate. This scheme is the third use case 111 | described above and is handled in zlib with the " (:tt "inflate") " 112 | function. In " ${package} ", it is just another method of " `decompress` 113 | ".") 114 | 115 | (:describe :method (chipz:decompress (vector chipz:decompression-state vector) (values n-bytes-consumed n-bytes-produced))) 116 | 117 | (:p "This method decompresses the data from " 'input' " between " 118 | 'input-start' " and " 'input-end' " and place the uncompressed data in " 119 | 'output' ", limited by " 'output-start' " and " 'output-end' ". Please 120 | note that it is possible to consume some or all of the input without 121 | producing any output and to produce some or all of the output without 122 | consuming any input.") 123 | 124 | (:p "As above, you can use a " `format` " argument instead of an " 125 | `decompression-state` ". You will usually not want to do this unless 126 | you know exactly how large the decompressed data is going to be; 127 | otherwise, you will only decompress a portion of the data and any 128 | intermediate state required to decompress the remainder of the data will 129 | be thrown away.") 130 | 131 | (:h3 "Decompressing to a stream") 132 | 133 | (:p "Finally, " `decompress` " can also be used to write the 134 | decompressed data directly to a stream, enabling a poor man's gunzip 135 | function:") 136 | 137 | (:pre "(defun gunzip (gzip-filename output-filename) 138 | (with-open-file (gzstream gzip-filename :direction :input 139 | :element-type '(unsigned-byte 8)) 140 | (with-open-file (stream output-filename :direction :output 141 | :element-type '(unsigned-byte 8) 142 | :if-exists :supersede) 143 | (chipz:decompress stream 'chipz:gzip gzstream) 144 | output-filename)))") 145 | 146 | (:p "The relevant methods in this case are:") 147 | 148 | (:describe :method (chipz:decompress (stream chipz:decompression-state vector) stream)) 149 | (:describe :method (chipz:decompress (stream chipz:decompression-state stream) stream)) 150 | 151 | (:p "Both return the output stream.") 152 | 153 | (:h3 "Creating " `decompression-state` " objects") 154 | 155 | (:p "The core data structure of " ${package} " is a " 156 | `decompression-state` ", which stores the internal state of an ongoing 157 | decompression process. You create a " `decompression-state` " with " 158 | @make-dstate ".") 159 | 160 | (:describe :function (chipz:make-dstate dstate)) 161 | 162 | (:p "Return an " `decompression-state` " object suitable for 163 | uncompressing data in " 'data-format' ". " 'data-format' " should be:") 164 | 165 | (:ul 166 | (:li `chipz:bzip2` " for decompressing data in the bzip2 format;") 167 | (:li `chipz:gzip` " for decompressing data in the gzip format;") 168 | (:li `chipz:zlib` " for decompressing data in the zlib format;") 169 | (:li `chipz:deflate` " for decompressing data in the deflate format.")) 170 | 171 | (:p "As with " @decompress ", you can use keywords instead, but doing so 172 | is deprecated.") 173 | 174 | (:p "Prior to adding bzip2 support, " ${package} " supported only 175 | deflate-based formats. " @make-inflate-state " was the primary 176 | interface then; it is now deprecated, but kept around for backwards 177 | compatibility.") 178 | 179 | (:describe :function (chipz:make-inflate-state inflate-state)) 180 | 181 | (:p @make-inflate-state " supports the same " 'data-format' " arguments 182 | as " @make-dstate " does, with the obvious exception of " 183 | 'chipz:bzip2' ". The " `inflate-state` " object returned is a " 184 | `decompression-state` ", so it can be passed to " @decompress " and " 185 | @finish-dstate ".") 186 | 187 | (:p "Once you are done with a " `decompression-state` " object, you must 188 | call " @finish-dstate " on it. " @finish-dstate " checks that the 189 | given " 'state' " decompressed all the data in a given stream. It does 190 | not dispose of any resources associated with " 'state' "; it is meant 191 | purely as an error-checking construct. Therefore, it is inappropriate 192 | to call from, say, the cleanup forms of " (:tt "UNWIND-PROTECT") ". The 193 | cleanup forms may be run when an error is thrown during decompression 194 | and of course the stream will only be partially decompressed at that 195 | point.") 196 | 197 | (:describe :function (chipz:finish-dstate t)) 198 | 199 | (:p @finish-inflate-state " does the same thing, but only for " 200 | `inflate-state` ". Its use, like that of " @make-inflate-state " is 201 | deprecated.") 202 | 203 | (:describe :function (chipz:finish-inflate-state t)) 204 | 205 | (:h2 "Gray streams") 206 | 207 | (:p ${package} " includes support for creating Gray streams to wrap 208 | streams containing compressed data and read the uncompressed data from 209 | those streams. SBCL, Allegro, Lispworks, CMUCL, and OpenMCL are 210 | supported at this time.") 211 | 212 | (:describe :function (chipz::make-decompressing-stream decompressing-stream)) 213 | 214 | (:p "Return a stream that provides transparent decompression of the data 215 | from " 'stream' " in " 'format' ". That is, " `read-byte` " and " 216 | `read-sequence` " will decompress the data read from " 'stream' " and 217 | return portions of the decompressed data as requested. " 'format' " is 218 | as in the " ((:a href "#one-shot") "one-shot decompression 219 | methods") ".") 220 | 221 | (:h2 "Conditions") 222 | 223 | (:describe :condition chipz-error) 224 | 225 | (:p "All errors signaled by " ${package} " are of this type.") 226 | 227 | (:describe :condition invalid-format-error) 228 | 229 | (:p "This error is signaled when the " 'format' " argument to " 230 | @decompress " or " @make-dstate " is not one of the symbols specified 231 | for " @make-dstate ". This error is also signaled in " 232 | @make-inflate-state " if the " 'format' " argument is not valid for that 233 | function.") 234 | 235 | (:describe :condition decompression-error) 236 | 237 | (:p "All errors signaled during decompression are of this type.") 238 | 239 | (:describe :condition invalid-checksum-error) 240 | 241 | (:p "The zlib, gzip, and bzip2 formats all contain checksums to verify 242 | the integrity of the uncompressed data; this error is signaled when the 243 | stored checksum is found to be inconsistent with the checksum computed 244 | by " ${package} ". It indicates that the compressed data has probably 245 | been corrupted in some fashion (or there is an error in " ${package} 246 | ").") 247 | 248 | (:describe :condition premature-end-of-stream) 249 | 250 | (:p "This error is signaled when " @finish-dstate " is 251 | called on an " `decompression-state` " that has not finished processing 252 | an entire decompressed data stream.") 253 | 254 | (:describe :condition inflate-error) 255 | 256 | (:p "All errors signaled while decompressing deflate-based formats are 257 | of this type.") 258 | 259 | (:describe :condition invalid-zlib-header-error) 260 | 261 | (:p "This error is signaled when an invalid zlib header is read.") 262 | 263 | (:describe :condition invalid-gzip-header-error) 264 | 265 | (:p "This error is signaled when an invalid gzip header is read.") 266 | 267 | (:describe :condition reserved-block-type-error) 268 | 269 | (:p "This error is signaled when a deflate block is read whose 270 | type is 3. This type is reserved for future expansion and should 271 | not be found in the wild.") 272 | 273 | (:describe :condition invalid-stored-block-length-error) 274 | 275 | (:p "This error is signaled when the length of a deflate stored 276 | block is found to be invalid.") 277 | 278 | (:describe :condition bzip2-error) 279 | 280 | (:p "All errors signaled while decompressing bzip2-based formats are of 281 | this type.") 282 | 283 | (:describe :condition invalid-bzip2-data) 284 | 285 | (:p "This error is signaled when the compressed bzip2 data is found to 286 | be corrupt in some way that prevents further decompression.") 287 | -------------------------------------------------------------------------------- /doc/index.html: -------------------------------------------------------------------------------- 1 | 3 | Chipz

Chipz

Chipz is a library for decompressing DEFLATE and BZIP2 data. 4 | DEFLATE data, defined in RFC1951, forms the core of popular compression formats such as 5 | zlib (RFC 1950) and 6 | gzip (RFC 1952). As 7 | such, Chipz also provides for decompressing data in those 8 | formats as well. BZIP2 is the format used by the popular compression 9 | tool bzip2.

Chipz is the reading complement to Salza.

Installation

Chipz can be downloaded at http://www.method-combination.net/lisp/files/chipz.tar.gz. The latest version is 0.8.

It comes with an ASDF system definition, so (ASDF:OOS 10 | 'ASDF:LOAD-OP :CHIPZ) should be all that you need to get started.

License

Chipz is released under a MIT-like license; you can do pretty 11 | much anything you want to with the code except claim that you wrote 12 | it.

Using the library

The main function of the library is decompress:

decompress output state input &key &allow-other-keys => output

Five distinct use cases are covered by this single function:

NoteChipz does not provide for decompressing data from a stream 13 | to a user-specified buffer, as the buffer management involved cannot be 14 | done automatically by the library--the application must be involved in 15 | this case.

One-shot decompression

The first and second use cases above are intended to be convenient 16 | "one-shot" decompression methods. Therefore, although the description 17 | of the following methods attached to this generic function have an decompression-state parameter, as returned by make-dstate, 18 | respectively, the usual way to use them will be to provide a format argument. This format argument should be one of:

The format argument can also be a keyword, such as :gzip, for backwards compatibility. Using symbols in the CHIPZ package is preferred, however.

Most applications will use chipz:gzip or chipz:bzip2, a 19 | few applications will use chipz:zlib, and uses of chipz:deflate will probably be few and far between.

All the method signatures described below also accept a format argument in lieu of an decompression-state argument.

The signatures of the first two methods are as follows.

decompress (output null) (state decompression-state) (input vector) &key (input-start 0) input-end buffer-size => output
decompress (output null) (state decompression-state) (input stream) &key buffer-size => output

A simple function to retrieve the contents of a gzip-compressed 20 | file, then, might be:

(defun gzip-contents (pathname)
21 |   (with-open-file (stream pathname :direction :input
22 |                                    :element-type '(unsigned-byte 8))
23 |     (chipz:decompress nil 'chipz:gzip stream)))

These one-shot methods also support a :buffer-size argument 24 | as a hint of the size of decompressed data. The library uses this to 25 | pre-allocate the output buffer to the hinted size. Therefore, if you 26 | know the size of the decompressed data or have a good estimate, fewer 27 | allocations will be done, leading to slightly better performance. If :buffer-size is not provided or proves to be too small, the library 28 | will of course grow the output buffer as necessary.

Decompressing to a vector

An alternate way to deal with compressed data is to read in a 29 | buffer's worth of data, decompress the buffer, and then deal with any 30 | remaining input and the produced output, looping to read and process 31 | more data as appropriate. This scheme is the third use case 32 | described above and is handled in zlib with the inflate 33 | function. In Chipz, it is just another method of decompress.

decompress (output vector) (state decompression-state) (input vector) &key (input-start 0) input-end (output-start 0) output-end => n-bytes-consumed, n-bytes-produced

This method decompresses the data from input between input-start and input-end and place the uncompressed data in output, limited by output-start and output-end. Please 34 | note that it is possible to consume some or all of the input without 35 | producing any output and to produce some or all of the output without 36 | consuming any input.

As above, you can use a format argument instead of an decompression-state. You will usually not want to do this unless 37 | you know exactly how large the decompressed data is going to be; 38 | otherwise, you will only decompress a portion of the data and any 39 | intermediate state required to decompress the remainder of the data will 40 | be thrown away.

Decompressing to a stream

Finally, decompress can also be used to write the 41 | decompressed data directly to a stream, enabling a poor man's gunzip 42 | function:

(defun gunzip (gzip-filename output-filename)
43 |   (with-open-file (gzstream gzip-filename :direction :input
44 |                             :element-type '(unsigned-byte 8))
45 |     (with-open-file (stream output-filename :direction :output
46 |                             :element-type '(unsigned-byte 8)
47 |                             :if-exists :supersede)
48 |       (chipz:decompress stream 'chipz:gzip gzstream)
49 |       output-filename)))

The relevant methods in this case are:

decompress (output stream) (state decompression-state) (input vector) &key (input-start 0) input-end => stream
decompress (output stream) (state decompression-state) (input stream) => stream

Both return the output stream.

Creating decompression-state objects

The core data structure of Chipz is a decompression-state, which stores the internal state of an ongoing 50 | decompression process. You create a decompression-state with make-dstate.

make-dstate format => dstate

Return an decompression-state object suitable for 51 | uncompressing data in data-format. data-format should be:

As with decompress, you can use keywords instead, but doing so 52 | is deprecated.

Prior to adding bzip2 support, Chipz supported only 53 | deflate-based formats. make-inflate-state was the primary 54 | interface then; it is now deprecated, but kept around for backwards 55 | compatibility.

make-inflate-state format => inflate-state

make-inflate-state supports the same data-format arguments 56 | as make-dstate does, with the obvious exception of chipz:bzip2. The inflate-state object returned is a decompression-state, so it can be passed to decompress and finish-dstate.

Once you are done with a decompression-state object, you must 57 | call finish-dstate on it. finish-dstate checks that the 58 | given state decompressed all the data in a given stream. It does 59 | not dispose of any resources associated with state; it is meant 60 | purely as an error-checking construct. Therefore, it is inappropriate 61 | to call from, say, the cleanup forms of UNWIND-PROTECT. The 62 | cleanup forms may be run when an error is thrown during decompression 63 | and of course the stream will only be partially decompressed at that 64 | point.

finish-dstate state => t

finish-inflate-state does the same thing, but only for inflate-state. Its use, like that of make-inflate-state is 65 | deprecated.

finish-inflate-state state => t

Gray streams

Chipz includes support for creating Gray streams to wrap 66 | streams containing compressed data and read the uncompressed data from 67 | those streams. SBCL, Allegro, Lispworks, CMUCL, and OpenMCL are 68 | supported at this time.

make-decompressing-stream format stream => decompressing-stream

Return a stream that provides transparent decompression of the data 69 | from stream in format. That is, read-byte and read-sequence will decompress the data read from stream and 70 | return portions of the decompressed data as requested. format is 71 | as in the one-shot decompression 72 | methods.

Conditions

chipz-error

All errors signaled by Chipz are of this type.

invalid-format-error

This error is signaled when the format argument to decompress or make-dstate is not one of the symbols specified 73 | for make-dstate. This error is also signaled in make-inflate-state if the format argument is not valid for that 74 | function.

decompression-error

All errors signaled during decompression are of this type.

invalid-checksum-error

The zlib, gzip, and bzip2 formats all contain checksums to verify 75 | the integrity of the uncompressed data; this error is signaled when the 76 | stored checksum is found to be inconsistent with the checksum computed 77 | by Chipz. It indicates that the compressed data has probably 78 | been corrupted in some fashion (or there is an error in Chipz).

premature-end-of-stream

This error is signaled when finish-dstate is 79 | called on an decompression-state that has not finished processing 80 | an entire decompressed data stream.

inflate-error

All errors signaled while decompressing deflate-based formats are 81 | of this type.

invalid-zlib-header-error

This error is signaled when an invalid zlib header is read.

invalid-gzip-header-error

This error is signaled when an invalid gzip header is read.

reserved-block-type-error

This error is signaled when a deflate block is read whose 82 | type is 3. This type is reserved for future expansion and should 83 | not be found in the wild.

invalid-stored-block-length-error

This error is signaled when the length of a deflate stored 84 | block is found to be invalid.

bzip2-error

All errors signaled while decompressing bzip2-based formats are of 85 | this type.

invalid-bzip2-data

This error is signaled when the compressed bzip2 data is found to 86 | be corrupt in some way that prevents further decompression.

-------------------------------------------------------------------------------- /doc/style.css: -------------------------------------------------------------------------------- 1 | body { 2 | margin: 1em 5% 1em 5%; 3 | } 4 | 5 | p { 6 | margin-top: 0.5em; 7 | margin-bottom: 0.5em; 8 | } 9 | 10 | pre { 11 | padding: 0; 12 | margin: 0; 13 | } 14 | 15 | h1, h2 { 16 | border-bottom: 2px solid #449977; 17 | } 18 | 19 | h1, h2, h3, h4, h5, h6 { 20 | font-family: sans-serif; 21 | line-height: 1.3; 22 | } 23 | 24 | a:link { 25 | color: #449977; 26 | } 27 | 28 | a:visited { 29 | color: purple; 30 | } 31 | 32 | a { 33 | text-decoration: none; 34 | padding: 1px 2px; 35 | } 36 | 37 | a:hover { 38 | text-decoration: none; 39 | padding: 1px; 40 | border: 1px solid #000000; 41 | } 42 | 43 | .lisp-symbol { 44 | margin-right: 10%; 45 | margin-top: 1.5em; 46 | margin-bottom: 1.5em; 47 | border: 1px solid #449977; 48 | background: #eeeeee; 49 | padding: 0.5em; 50 | } 51 | 52 | .note { 53 | margin-right: 10%; 54 | margin-top: 1.5em; 55 | margin-bottom: 1.5em; 56 | } 57 | 58 | td.content { 59 | padding: 0; 60 | } 61 | 62 | td.title { 63 | font-family: sans-serif; 64 | font-size: 1.1em; 65 | font-weight: bold; 66 | text-align: left; 67 | vertical-align: top; 68 | text-decoration: underline; 69 | padding-right: 0.5em; 70 | margin-top: 0.0em; 71 | margin-bottom: 0.5em; 72 | } 73 | 74 | .note td.content { 75 | padding-left: 0.5em; 76 | border-left: 2px solid #449977; 77 | } 78 | -------------------------------------------------------------------------------- /dstate.lisp: -------------------------------------------------------------------------------- 1 | ;;;; dstate.lisp -- common bits for decompression state 2 | 3 | (in-package :chipz) 4 | 5 | ;;; This structure is never meant to be instantiated. It exists only to 6 | ;;; provide common framework for other decompressors. 7 | (defstruct (decompression-state 8 | (:constructor) 9 | (:conc-name dstate-)) 10 | (state nil :type (or null function)) 11 | (done nil) 12 | 13 | (input (make-array 1 :element-type '(unsigned-byte 8)) 14 | :type simple-octet-vector) 15 | (input-start 0 :type (and fixnum (integer 0 *))) 16 | (input-index 0 :type (and fixnum (integer 0 *))) 17 | (input-end 0 :type (and fixnum (integer 0 *))) 18 | 19 | (output (make-array 1 :element-type '(unsigned-byte 8)) 20 | :type simple-octet-vector) 21 | (output-start 0 :type (and fixnum (integer 0 *))) 22 | (output-index 0 :type (and fixnum (integer 0 *))) 23 | (output-end 0 :type (and fixnum (integer 0 *))) 24 | 25 | ;; Checksums of various sorts. 26 | (checksum nil) 27 | (update-checksum nil :type (or null function)) 28 | 29 | ;; Bit buffer. 30 | (bits 0 :type (unsigned-byte 32)) 31 | (n-bits 0 :type (integer 0 32))) 32 | 33 | (defun make-dstate (format) 34 | "Return a structure suitable for uncompressing data in DATA-FORMAT; 35 | DATA-FORMAT should be: 36 | 37 | :BZIP2 or CHIPZ:BZIP2 For decompressing data in the `bzip2' format; 38 | :GZIP or CHIPZ:GZIP For decompressing data in the `gzip' format; 39 | :ZLIB or CHIPZ:ZLIB For decompressing data in the `zlib' format; 40 | :DEFLATE or CHIPZ:DEFLATE For decompressing data in the `deflate' format. 41 | 42 | The usual value of DATA-FORMAT will be one of CHIPZ:BZIP2 or CHIPZ:GZIP." 43 | (case format 44 | ((:deflate :zlib :gzip 45 | deflate zlib gzip) 46 | (make-inflate-state format)) 47 | ((:bzip2 bzip2) 48 | (make-bzip2-state)) 49 | (t 50 | (error 'invalid-format-error :format format)))) 51 | 52 | (defun finish-dstate (state) 53 | (unless (dstate-done state) 54 | (error 'premature-end-of-stream)) 55 | t) 56 | -------------------------------------------------------------------------------- /gzip.lisp: -------------------------------------------------------------------------------- 1 | ;;;; gzip.lisp -- dealing with gzip-wrapped deflate data 2 | 3 | (in-package :chipz) 4 | 5 | (defclass gzip-header () 6 | ((flags :initarg :flags :accessor flags) 7 | (filename :initform nil :accessor filename) 8 | (write-date :initarg :write-date :accessor write-date) 9 | (mtime :initform 0 :accessor mtime) 10 | (comment :initform nil :accessor comment) 11 | (extra-flags :initarg :extra-flags :accessor extra-flags) 12 | (os :initarg :os :accessor os) 13 | (crc16 :initarg :crc16 :accessor crc16) 14 | (compression-method :initarg :compression-method :accessor compression-method))) 15 | 16 | ;;; individual bit meanings in the flag field 17 | (defconstant +gzip-flag-text+ 0) 18 | (defconstant +gzip-flag-crc+ 1) 19 | (defconstant +gzip-flag-extra+ 2) 20 | (defconstant +gzip-flag-name+ 3) 21 | (defconstant +gzip-flag-comment+ 4) 22 | 23 | ;;; values of the compression method byte 24 | (defconstant +gzip-deflate-method+ 8) 25 | 26 | ;;; values of the extra flag field 27 | (defconstant +gzip-xfl-max-compression+ 2) 28 | (defconstant +gzip-xfl-fast-compression+ 4) 29 | -------------------------------------------------------------------------------- /inflate-state.lisp: -------------------------------------------------------------------------------- 1 | ;;; inflate-state.lisp -- definition of an inflate state 2 | 3 | (in-package :chipz) 4 | 5 | (deftype sliding-window () '(simple-array (unsigned-byte 8) (32768))) 6 | 7 | (defstruct (inflate-state 8 | (:include decompression-state) 9 | (:constructor %make-inflate-state (data-format))) 10 | ;; whether the current block being processed is the last one 11 | (final-block-p nil :type (member t nil)) 12 | ;; the number of bytes to copy for uncompressed blocks 13 | (length 0) 14 | ;; the code for length/distance codes 15 | (distance 0) 16 | (length-code 0 :type (integer 0 28)) 17 | (distance-code 0 :type (integer 0 31)) 18 | ;; values for dynamic blocks 19 | (n-length-codes 0) 20 | (n-distance-codes 0) 21 | (n-codes 0) 22 | (n-values-read 0) 23 | (code-lengths (make-array 288) :type (simple-vector 288)) 24 | ;; sliding window 25 | (window (make-array 32768 :element-type '(unsigned-byte 8)) 26 | :type sliding-window) 27 | ;; position in the sliding window 28 | (window-index 0 :type (mod 32768)) 29 | ;; codes table for dynamically compressed blocks 30 | (codes-table nil) 31 | ;; literal/length table for compressed blocks 32 | (literal/length-table *fixed-literal/length-table* 33 | :type huffman-decode-table) 34 | ;; distance table for compressed blocks 35 | (distance-table *fixed-distance-table* :type huffman-decode-table) 36 | ;; header for wrapped data, or NIL if raw deflate data 37 | (header nil) 38 | ;; format of the compressed data that we're reading 39 | (data-format 'deflate :type (member deflate zlib gzip))) 40 | 41 | (defun make-inflate-state (format) 42 | "Return a INFLATE-STATE structure suitable for uncompressing data in 43 | FORMAT; FORMAT should be: 44 | 45 | :GZIP or CHIPZ:GZIP For decompressing data in the `gzip' format; 46 | :ZLIB or CHIPZ:ZLIB For decompressing data in the `zlib' format; 47 | :DEFLATE or CHIPZ:DEFLATE For decompressing data in the `deflate' format. 48 | 49 | The usual value of FORMAT will be one of CHIPZ:GZIP or CHIPZ:ZLIB." 50 | (let* ((f (case format 51 | ((:gzip gzip) 'gzip) 52 | ((:zlib zlib) 'zlib) 53 | ((:deflate deflate) 'deflate) 54 | (t 55 | (error 'invalid-format-error :format format)))) 56 | (state (%make-inflate-state f))) 57 | (case f 58 | (gzip 59 | (setf (dstate-checksum state) (make-crc32) 60 | (dstate-update-checksum state) #'update-crc32)) 61 | (zlib 62 | (setf (dstate-checksum state) (make-adler32) 63 | (dstate-update-checksum state) #'update-adler32))) 64 | state)) 65 | 66 | (defun finish-inflate-state (state) 67 | (unless (inflate-state-done state) 68 | (error 'premature-end-of-stream)) 69 | t) 70 | 71 | (defmethod print-object ((object inflate-state) stream) 72 | (print-unreadable-object (object stream) 73 | (format stream "Inflate-State input ~D/~D; output ~D/~D" 74 | (- (inflate-state-input-index object) 75 | (inflate-state-input-start object)) 76 | (- (inflate-state-input-end object) 77 | (inflate-state-input-index object)) 78 | (- (inflate-state-output-index object) 79 | (inflate-state-output-start object)) 80 | (- (inflate-state-output-end object) 81 | (inflate-state-output-index object))))) 82 | -------------------------------------------------------------------------------- /inflate.lisp: -------------------------------------------------------------------------------- 1 | (in-package :chipz) 2 | 3 | (defun update-window (state) 4 | (declare (type inflate-state state)) 5 | (let* ((output (inflate-state-output state)) 6 | (start (inflate-state-output-start state)) 7 | (index (inflate-state-output-index state)) 8 | (n-bytes-to-copy (- index start)) 9 | (window (inflate-state-window state)) 10 | (window-index (inflate-state-window-index state))) 11 | (cond 12 | ((>= n-bytes-to-copy (length window)) 13 | ;; can "flush" the window 14 | (setf (inflate-state-window-index state) 0) 15 | (replace window output :start2 (- index (length window)) 16 | :end2 index)) 17 | (t 18 | (let ((window-space (- (length window) window-index))) 19 | (cond 20 | ((> n-bytes-to-copy window-space) 21 | (replace window output :start1 window-index 22 | :start2 start :end2 index) 23 | (replace window output 24 | :start2 (+ start window-space) 25 | :end2 index) 26 | (setf (inflate-state-window-index state) 27 | (- n-bytes-to-copy window-space))) 28 | (t 29 | (replace window output :start1 window-index 30 | :start2 start :end2 index) 31 | (setf (inflate-state-window-index state) 32 | (mod (+ window-index n-bytes-to-copy) (length window)))))))))) 33 | 34 | ;;; This is used behind-the-scenes to do efficient buffer->buffer 35 | ;;; decompression. Everything user-visible that's related to 36 | ;;; decompression ultimately comes down to this function. 37 | (defun %inflate (state input output &key (input-start 0) input-end 38 | (output-start 0) output-end) 39 | "Decompresses data in INPUT between INPUT-START and INPUT-END 40 | and places the result in OUTPUT between OUTPUT-START and 41 | OUTPUT-END. -START and -END arguments follow the convention of 42 | the sequence functions. Returns the number of bytes pulled from 43 | the input and the number of bytes written to the output." 44 | (declare (type inflate-state state)) 45 | (let* ((input-end (or input-end (length input))) 46 | (output-end (or output-end (length output)))) 47 | (setf (inflate-state-input state) input 48 | (inflate-state-input-start state) input-start 49 | (inflate-state-input-index state) input-start 50 | (inflate-state-input-end state) input-end 51 | (inflate-state-output state) output 52 | (inflate-state-output-start state) output-start 53 | (inflate-state-output-index state) output-start 54 | (inflate-state-output-end state) output-end) 55 | (catch 'inflate-done 56 | (%inflate-state-machine state)) 57 | (update-window state) 58 | (when (dstate-update-checksum state) 59 | (funcall (dstate-update-checksum state) 60 | (dstate-checksum state) output output-start 61 | (inflate-state-output-index state))) 62 | (values (- (inflate-state-input-index state) input-start) 63 | (- (inflate-state-output-index state) output-start)))) 64 | 65 | 66 | (defun record-code-length (state value) 67 | (setf (aref (inflate-state-code-lengths state) 68 | (aref *code-length-code-order* 69 | (inflate-state-n-values-read state))) value) 70 | (incf (inflate-state-n-values-read state))) 71 | 72 | 73 | ;;; internal inflate function 74 | 75 | (defun %inflate-state-machine (state) 76 | (declare (type inflate-state state)) 77 | (declare (optimize (speed 3) (debug 1) (space 0) (compilation-speed 0))) 78 | ;; Once upon a time, the individual functions in the LABELS below were 79 | ;; separate functions. We drove the state machine of this function 80 | ;; using LOOP and SYMBOL-FUNCTION. This scheme looked lovely...except 81 | ;; that SYMBOL-FUNCTION is a horrible thing to call in inner loops, 82 | ;; and we were calling it for just about every byte of input. 83 | ;; 84 | ;; So we switched to this huge LABELS. Each function then stored a 85 | ;; reference to its next state in INFLATE-STATE-STATE before jumping 86 | ;; to the next function. Some compilers were even able to optimize 87 | ;; the call into a fallthru, which provides a nice approximation of a 88 | ;; C switch statement. That was fine and dandy...except that the jump 89 | ;; is a tail call, Common Lisp is not Scheme, and some implementations 90 | ;; do not optimize tail calls. This combination led to stack 91 | ;; overflows if you handed a large input buffer to this function. 92 | ;; 93 | ;; So we provide alternatives now through the TRANSITION-TO macro. On 94 | ;; implementations we're sure we can trust to DTRT, we keep the second 95 | ;; scheme above. On other implementations, we use a variant of the 96 | ;; first scheme we tried, which is to simply store the next state's 97 | ;; function in INFLATE-STATE-STATE and return. This at least avoids 98 | ;; SYMBOL-FUNCTION and keeps constant stack space; the LOOP in the 99 | ;; body of the LABELS (waaay down there) makes sure that we don't stop 100 | ;; until we THROW. 101 | (macrolet ((transition-to (next-state) 102 | `(progn 103 | (setf (inflate-state-state state) #',next-state) 104 | #+(or sbcl cmu) 105 | (,next-state state) 106 | ;; Just fall through for other implementations and 107 | ;; return normally. 108 | ))) 109 | (labels ( 110 | (read-bits (n state) 111 | (declare (type (integer 0 32) n)) 112 | (declare (type inflate-state state)) 113 | (prog1 (ldb (byte n 0) (inflate-state-bits state)) 114 | (setf (inflate-state-bits state) 115 | (ash (inflate-state-bits state) (- n))) 116 | (decf (inflate-state-n-bits state) n))) 117 | 118 | (ensure-bits (n state) 119 | (declare (type (integer 0 32) n)) 120 | (declare (type inflate-state state)) 121 | (let ((bits (inflate-state-bits state)) 122 | (n-bits (inflate-state-n-bits state)) 123 | (input-index (inflate-state-input-index state))) 124 | (declare (type (unsigned-byte 32) bits)) 125 | (loop while (< n-bits n) 126 | when (>= input-index (inflate-state-input-end state)) 127 | do (progn 128 | (setf (inflate-state-bits state) bits 129 | (inflate-state-n-bits state) n-bits 130 | (inflate-state-input-index state) input-index) 131 | (throw 'inflate-done nil)) 132 | do (let ((byte (aref (inflate-state-input state) input-index))) 133 | (declare (type (unsigned-byte 8) byte)) 134 | (setf bits 135 | (logand #xffffffff (logior (ash byte n-bits) bits))) 136 | (incf n-bits 8) 137 | (incf input-index)) 138 | finally (setf (inflate-state-bits state) bits 139 | (inflate-state-n-bits state) n-bits 140 | (inflate-state-input-index state) input-index)))) 141 | 142 | (ensure-and-read-bits (n state) 143 | (ensure-bits n state) 144 | (read-bits n state)) 145 | 146 | (align-bits-bytewise (state) 147 | (declare (type inflate-state state)) 148 | (let ((n-bits (inflate-state-n-bits state))) 149 | (decf (inflate-state-n-bits state) (rem n-bits 8)) 150 | (setf (inflate-state-bits state) 151 | (ash (inflate-state-bits state) 152 | (- (rem n-bits 8)))) 153 | (values))) 154 | 155 | (decode-value (table state) 156 | (declare (type huffman-decode-table table)) 157 | (declare (type inflate-state state)) 158 | (declare (optimize (speed 3))) 159 | (ensure-bits (hdt-bits table) state) 160 | (let ((bits (inflate-state-bits state))) 161 | (declare (type (unsigned-byte 32) bits)) 162 | (do ((counts (hdt-counts table)) 163 | (len 1 (1+ len)) 164 | (first 0 (probably-the-fixnum (ash first 1))) 165 | (code 0 (probably-the-fixnum (ash code 1)))) 166 | ((>= len +max-code-length+) nil) 167 | (declare (type (and fixnum (integer 0 *)) first code)) 168 | ;; We would normally do this with READ-BITS, but DECODE-VALUE 169 | ;; is a hotspot in profiles along with this would-be call to 170 | ;; READ-BITS, so we inline it all here. 171 | (setf code (logior code (logand bits 1)) 172 | bits (ash bits -1)) 173 | (let ((count (aref counts len))) 174 | (when (< (- code count) first) 175 | (setf (inflate-state-bits state) bits) 176 | (decf (inflate-state-n-bits state) len) 177 | (return-from decode-value (aref (hdt-symbols table) 178 | (probably-the-fixnum 179 | (+ (aref (hdt-offsets table) (1- len)) 180 | (- code first)))))) 181 | (setf first 182 | (probably-the-fixnum (+ first count))))))) 183 | 184 | (read-dynamic-table (state decoder n-values) 185 | (declare (type inflate-state state)) 186 | (loop with lengths = (inflate-state-code-lengths state) 187 | while (< (inflate-state-n-values-read state) n-values) 188 | do (ensure-bits (+ (hdt-bits decoder) 7) state) 189 | (let ((value (decode-value decoder state))) 190 | (cond 191 | ((< value 16) 192 | (setf (aref lengths (inflate-state-n-values-read state)) value) 193 | (incf (inflate-state-n-values-read state))) 194 | (t 195 | (let ((len 0) (sym 0)) 196 | (cond 197 | ((= value 16) 198 | (setf sym (aref lengths (1- (inflate-state-n-values-read state)))) 199 | (setf len (+ 3 (read-bits 2 state)))) 200 | ((= value 17) 201 | (setf len (+ 3 (read-bits 3 state)))) 202 | ((= value 18) 203 | (setf len (+ 11 (read-bits 7 state))))) 204 | (fill lengths sym :start (inflate-state-n-values-read state) 205 | :end (+ (inflate-state-n-values-read state) len)) 206 | (incf (inflate-state-n-values-read state) len))))) 207 | finally (progn 208 | (assert (= n-values (inflate-state-n-values-read state))) 209 | (return (construct-huffman-decode-table lengths n-values))))) 210 | 211 | ;; Basic starter functions. 212 | (done (state) 213 | (declare (ignore state)) 214 | (throw 'inflate-done t)) 215 | 216 | (block-type (state) 217 | (cond 218 | ((inflate-state-final-block-p state) 219 | (align-bits-bytewise state) 220 | (setf (inflate-state-state state) 221 | (ecase (inflate-state-data-format state) 222 | (deflate 223 | (setf (inflate-state-done state) t) 224 | #'done) 225 | (zlib #'check-zlib-adler32) 226 | (gzip #'gzip-crc32)))) 227 | (t 228 | (ensure-bits 3 state) 229 | (setf (inflate-state-final-block-p state) (= 1 (read-bits 1 state))) 230 | (ecase (read-bits 2 state) 231 | (#.+block-no-compress+ 232 | (transition-to uncompressed-block)) 233 | (#.+block-fixed-codes+ 234 | (setf (inflate-state-literal/length-table state) 235 | *fixed-literal/length-table* 236 | (inflate-state-distance-table state) 237 | *fixed-distance-table*) 238 | (transition-to literal/length)) 239 | (#.+block-dynamic-codes+ 240 | (transition-to dynamic-tables)) 241 | (#.+block-invalid+ 242 | (error 'reserved-block-type-error)))))) 243 | 244 | ;;; processing uncompressed blocks 245 | 246 | (uncompressed-block (state) 247 | (align-bits-bytewise state) 248 | (let* ((len (ensure-and-read-bits 16 state)) 249 | (nlen (ensure-and-read-bits 16 state))) 250 | (unless (zerop (logand len nlen)) 251 | ;; Apparently Adobe's PDF generator(s) get this wrong, so let the 252 | ;; user continue on if they choose to do so. 253 | (cerror "Use the invalid stored block length." 254 | 'invalid-stored-block-length-error)) 255 | (setf (inflate-state-length state) len) 256 | (transition-to copy-bytes))) 257 | 258 | (copy-bytes (state) 259 | (declare (type inflate-state state)) 260 | (if (zerop (inflate-state-length state)) 261 | (setf (inflate-state-state state) #'block-type) 262 | (let ((n-copied-bytes (min (inflate-state-length state) 263 | (- (inflate-state-input-end state) 264 | (inflate-state-input-index state)) 265 | (- (inflate-state-output-end state) 266 | (inflate-state-output-index state))))) 267 | (cond 268 | ((zerop n-copied-bytes) (throw 'inflate-done nil)) 269 | (t 270 | (replace (inflate-state-output state) 271 | (inflate-state-input state) 272 | :start1 (inflate-state-output-index state) 273 | :end1 (+ (inflate-state-output-index state) 274 | n-copied-bytes) 275 | :start2 (inflate-state-input-index state) 276 | :end2 (+ (inflate-state-input-index state) 277 | n-copied-bytes)) 278 | (incf (inflate-state-input-index state) n-copied-bytes) 279 | (incf (inflate-state-output-index state) n-copied-bytes) 280 | (decf (inflate-state-length state) n-copied-bytes))))) 281 | (values)) 282 | 283 | ;;; dynamic block compression tables 284 | 285 | (dynamic-tables (state) 286 | (declare (type inflate-state state)) 287 | (ensure-bits 14 state) 288 | (setf (inflate-state-n-length-codes state) (+ (read-bits 5 state) 257) 289 | (inflate-state-n-distance-codes state) (+ (read-bits 5 state) 1) 290 | (inflate-state-n-codes state) (+ (read-bits 4 state) 4) 291 | (inflate-state-n-values-read state) 0) 292 | (transition-to dynamic-code-lengths)) 293 | 294 | (dynamic-code-lengths (state) 295 | (declare (type inflate-state state)) 296 | (loop while (< (inflate-state-n-values-read state) 297 | (inflate-state-n-codes state)) 298 | do (ensure-bits 3 state) 299 | (record-code-length state (read-bits 3 state))) 300 | (loop while (< (inflate-state-n-values-read state) +max-n-code-lengths+) 301 | do (record-code-length state 0)) 302 | (setf (inflate-state-codes-table state) 303 | (construct-huffman-decode-table (inflate-state-code-lengths state) 304 | +max-n-code-lengths+) 305 | (inflate-state-n-values-read state) 0) 306 | (transition-to dynamic-literal/length-table)) 307 | 308 | (dynamic-literal/length-table (state) 309 | (declare (type inflate-state state)) 310 | (setf (inflate-state-literal/length-table state) 311 | (read-dynamic-table state (inflate-state-codes-table state) 312 | (inflate-state-n-length-codes state)) 313 | (inflate-state-n-values-read state) 0) 314 | (transition-to dynamic-distance-table)) 315 | 316 | (dynamic-distance-table (state) 317 | (declare (type inflate-state state)) 318 | (setf (inflate-state-distance-table state) 319 | (read-dynamic-table state (inflate-state-codes-table state) 320 | (inflate-state-n-distance-codes state))) 321 | (transition-to literal/length)) 322 | 323 | ;;; normal operation on compressed blocks 324 | 325 | (literal/length (state) 326 | (declare (type inflate-state state)) 327 | (let ((value (decode-value (inflate-state-literal/length-table state) 328 | state))) 329 | (declare (type (integer 0 288) value)) 330 | (cond 331 | ((< value 256) 332 | (setf (inflate-state-length state) value) 333 | (transition-to literal)) 334 | ((> value 256) 335 | (setf (inflate-state-length-code state) (- value 257)) 336 | (transition-to length-code)) 337 | (t #+nil (= value 256) 338 | (transition-to block-type))))) 339 | 340 | (literal (state) 341 | (declare (type inflate-state state)) 342 | (cond 343 | ((= (inflate-state-output-index state) 344 | (inflate-state-output-end state)) (throw 'inflate-done nil)) 345 | (t (setf (aref (inflate-state-output state) 346 | (inflate-state-output-index state)) 347 | (inflate-state-length state)) 348 | (incf (inflate-state-output-index state)) 349 | (transition-to literal/length)))) 350 | 351 | (length-code (state) 352 | (declare (type inflate-state state)) 353 | (let* ((length-code (inflate-state-length-code state)) 354 | (length-extra (ensure-and-read-bits (n-length-extra-bits length-code) state))) 355 | (setf (inflate-state-length state) 356 | (+ (length-base length-code) length-extra)) 357 | (transition-to distance))) 358 | 359 | (distance (state) 360 | (declare (type inflate-state state)) 361 | (let ((value (decode-value (inflate-state-distance-table state) 362 | state))) 363 | (setf (inflate-state-distance state) value) 364 | (transition-to distance-extra))) 365 | 366 | (distance-extra (state) 367 | (declare (type inflate-state state)) 368 | (let* ((bits (n-distance-extra-bits (inflate-state-distance state))) 369 | (distance-extra (if (zerop bits) 370 | 0 371 | (ensure-and-read-bits bits state)))) 372 | (setf (inflate-state-distance state) 373 | (+ (distance-base (inflate-state-distance state)) distance-extra)) 374 | (transition-to copy-match))) 375 | 376 | (copy-match (state) 377 | (declare (type inflate-state state)) 378 | (let* ((distance (inflate-state-distance state)) 379 | (length (inflate-state-length state)) 380 | (start (inflate-state-output-start state)) 381 | (index (inflate-state-output-index state)) 382 | (end (inflate-state-output-end state)) 383 | (window-index (inflate-state-window-index state)) 384 | (n-bytes-to-copy (min length (- end index)))) 385 | (when (= index end) 386 | (throw 'inflate-done nil)) 387 | (flet ((frob-by-copying-from (copy-source copy-index n-bytes-to-copy) 388 | (declare (type (simple-array (unsigned-byte 8) (*)) copy-source)) 389 | (decf (inflate-state-length state) n-bytes-to-copy) 390 | (incf (inflate-state-output-index state) n-bytes-to-copy) 391 | (loop with output = (inflate-state-output state) 392 | for i from index below (the fixnum (+ index n-bytes-to-copy)) 393 | for j from copy-index below (the fixnum (+ copy-index n-bytes-to-copy)) 394 | do (setf (aref output i) (aref copy-source j))))) 395 | (cond 396 | ((<= distance (- index start)) 397 | ;; we are within the output we have produced 398 | (frob-by-copying-from (inflate-state-output state) 399 | (- index distance) 400 | n-bytes-to-copy)) 401 | (t 402 | (let ((copy-index (+ (- window-index distance) (- index start)))) 403 | (cond 404 | ((not (minusp copy-index)) 405 | ;; we are within the non-wraparound portion of the window 406 | ;; 407 | ;; can only copy up to the window's index, though 408 | (let ((n-bytes-to-copy (min n-bytes-to-copy (- window-index copy-index)))) 409 | (frob-by-copying-from (inflate-state-window state) 410 | copy-index 411 | n-bytes-to-copy))) 412 | (t 413 | ;; we are within the wraparound portion of the window 414 | (let* ((copy-index (+ copy-index 415 | (length (inflate-state-window state)))) 416 | (n-bytes-to-copy (min n-bytes-to-copy 417 | (- (length (inflate-state-window state)) 418 | copy-index)))) 419 | (frob-by-copying-from (inflate-state-window state) 420 | copy-index 421 | n-bytes-to-copy))))))) 422 | (when (zerop (inflate-state-length state)) 423 | (transition-to literal/length))))) 424 | 425 | ;; GZIP 426 | (gzip-header-id (state) 427 | (declare (type inflate-state state)) 428 | (let ((header-field (ensure-and-read-bits 16 state))) 429 | (unless (and (= (ldb (byte 8 0) header-field) #x1f) 430 | (= (ldb (byte 8 8) header-field) #x8b)) 431 | (error 'invalid-gzip-header-error)) 432 | (transition-to gzip-cm))) 433 | 434 | (gzip-cm (state) 435 | (declare (type inflate-state state)) 436 | (let ((cm-byte (ensure-and-read-bits 8 state))) 437 | (setf (inflate-state-header state) 438 | (make-instance 'gzip-header :compression-method cm-byte)) 439 | (transition-to gzip-flags))) 440 | 441 | (gzip-flags (state) 442 | (declare (type inflate-state state)) 443 | (let ((flags-byte (ensure-and-read-bits 8 state))) 444 | (setf (flags (inflate-state-header state)) flags-byte) 445 | (transition-to gzip-mtime))) 446 | 447 | (gzip-mtime (state) 448 | (declare (type inflate-state state)) 449 | (let ((mtime (ensure-and-read-bits 32 state))) 450 | (setf (mtime (inflate-state-header state)) mtime) 451 | (transition-to gzip-xfl))) 452 | 453 | (gzip-xfl (state) 454 | (declare (type inflate-state state)) 455 | (let ((xfl-byte (ensure-and-read-bits 8 state))) 456 | (setf (extra-flags (inflate-state-header state)) xfl-byte) 457 | (transition-to gzip-os))) 458 | 459 | (gzip-os (state) 460 | (declare (type inflate-state state)) 461 | (let ((os-byte (ensure-and-read-bits 8 state))) 462 | (setf (os (inflate-state-header state)) os-byte) 463 | (transition-to gzip-xlen-len))) 464 | 465 | (gzip-xlen-len (state) 466 | (declare (type inflate-state state)) 467 | (let ((flags (flags (inflate-state-header state)))) 468 | (cond 469 | ((logbitp +gzip-flag-extra+ flags) 470 | (error "gzip extra field not supported yet")) 471 | (t 472 | (transition-to gzip-fname))))) 473 | 474 | (gzip-fname (state) 475 | (declare (type inflate-state state)) 476 | (process-gzip-zero-terminated-field state +gzip-flag-name+ 477 | #'filename #'(setf filename) 478 | #'gzip-fcomment)) 479 | 480 | (gzip-fcomment (state) 481 | (declare (type inflate-state state)) 482 | (process-gzip-zero-terminated-field state +gzip-flag-comment+ 483 | #'comment #'(setf comment) 484 | #'gzip-crc16)) 485 | 486 | (process-gzip-zero-terminated-field (state control-bit 487 | slot set-slot 488 | next-state) 489 | (let ((header (inflate-state-header state))) 490 | (cond 491 | ((logbitp control-bit (flags header)) 492 | (let ((byte (ensure-and-read-bits 8 state))) 493 | (cond 494 | ((zerop byte) 495 | ;; the end, convert to sane form 496 | (funcall set-slot 497 | (coerce (funcall slot header) 498 | '(vector (unsigned-byte 8))) 499 | header) 500 | (setf (inflate-state-state state) next-state)) 501 | (t 502 | ;; wish we could use PUSH here 503 | (funcall set-slot 504 | (cons byte (funcall slot header)) 505 | header))))) 506 | (t 507 | (setf (inflate-state-state state) next-state))) 508 | (values))) 509 | 510 | (gzip-crc16 (state) 511 | (declare (type inflate-state state)) 512 | (let ((header (inflate-state-header state))) 513 | (when (logbitp +gzip-flag-crc+ (flags header)) 514 | (let ((crc16 (ensure-and-read-bits 16 state))) 515 | ;; FIXME: would be good to perform integrity checking here 516 | (declare (ignore crc16)))) 517 | (transition-to block-type))) 518 | 519 | (gzip-crc32 (state) 520 | (declare (type inflate-state state)) 521 | (let ((stored (ensure-and-read-bits 32 state)) 522 | (crc32 (copy-crc32 (inflate-state-checksum state)))) 523 | (update-crc32 crc32 524 | (inflate-state-output state) 525 | (inflate-state-output-start state) 526 | (inflate-state-output-index state)) 527 | (unless (= stored (produce-crc32 crc32)) 528 | (error 'invalid-checksum-error 529 | :stored stored 530 | :computed (produce-crc32 crc32) 531 | :kind :crc32)) 532 | (transition-to gzip-isize))) 533 | 534 | (gzip-isize (state) 535 | (declare (type inflate-state state)) 536 | (let ((isize (ensure-and-read-bits 32 state))) 537 | (declare (ignore isize)) 538 | (setf (inflate-state-done state) t) 539 | (transition-to done))) 540 | 541 | ;; ZLIB 542 | (zlib-cmf (state) 543 | (declare (type inflate-state state)) 544 | (let ((cmf-byte (ensure-and-read-bits 8 state))) 545 | (setf (inflate-state-header state) 546 | (make-instance 'zlib-header :cmf cmf-byte)) 547 | (transition-to zlib-flags))) 548 | 549 | (zlib-flags (state) 550 | (declare (type inflate-state state)) 551 | (let ((flags-byte (ensure-and-read-bits 8 state)) 552 | (header (inflate-state-header state))) 553 | ;; check 554 | (unless (zerop (mod (+ (* (cmf header) 256) flags-byte) 31)) 555 | (error 'invalid-zlib-header-error)) 556 | (setf (flags header) flags-byte) 557 | (transition-to zlib-fdict))) 558 | 559 | (zlib-fdict (state) 560 | (declare (type inflate-state state)) 561 | (let* ((header (inflate-state-header state)) 562 | (flags-byte (flags header))) 563 | (when (logbitp +zlib-flag-fdict+ flags-byte) 564 | (let ((fdict (ensure-and-read-bits 32 state))) 565 | (setf (fdict header) fdict))) 566 | (transition-to block-type))) 567 | 568 | (check-zlib-adler32 (state) 569 | (declare (type inflate-state state)) 570 | (let ((stored (let ((x (ensure-and-read-bits 32 state))) 571 | (logior (ash (ldb (byte 8 0) x) 24) 572 | (ash (ldb (byte 8 8) x) 16) 573 | (ash (ldb (byte 8 16) x) 8) 574 | (ldb (byte 8 24) x)))) 575 | (adler32 (copy-adler32 (inflate-state-checksum state)))) 576 | (update-adler32 adler32 577 | (inflate-state-output state) 578 | (inflate-state-output-start state) 579 | (inflate-state-output-index state)) 580 | (unless (= stored 581 | (produce-adler32 adler32)) 582 | (error 'invalid-checksum-error 583 | :stored stored 584 | :computed (produce-adler32 adler32) 585 | :kind :adler32)) 586 | (setf (inflate-state-done state) t) 587 | (transition-to done))) 588 | ) 589 | (unless (inflate-state-state state) 590 | (setf (inflate-state-state state) 591 | (ecase (inflate-state-data-format state) 592 | (deflate #'block-type) 593 | (zlib #'zlib-cmf) 594 | (gzip #'gzip-header-id)))) 595 | (loop (funcall (inflate-state-state state) state))))) 596 | -------------------------------------------------------------------------------- /package.lisp: -------------------------------------------------------------------------------- 1 | (defpackage :chipz 2 | (:use :cl) 3 | (:export #:decompression-state 4 | #:inflate-state 5 | #:bzip2-state 6 | 7 | #:make-dstate 8 | #:finish-dstate 9 | 10 | ;; Only for API compatibility 11 | #:make-inflate-state 12 | #:finish-inflate-state 13 | 14 | ;; Main user-visible entry point 15 | #:decompress 16 | 17 | ;; Symbols for EQL specializers 18 | #:deflate 19 | #:zlib 20 | #:gzip 21 | #:bzip2 22 | 23 | ;; Gray streams 24 | #:make-decompressing-stream 25 | 26 | ;; conditions 27 | 28 | #:chipz-error 29 | #:invalid-format-error 30 | #:decompression-error 31 | #:invalid-checksum-error 32 | #:premature-end-of-stream 33 | #:inflate-error 34 | #:invalid-zlib-header-error 35 | #:invalid-gzip-header-error 36 | #:reserved-block-type-error 37 | #:invalid-stored-block-length-error 38 | #:bzip2-error 39 | #:invalid-bzip2-data)) 40 | -------------------------------------------------------------------------------- /stream-fallback.lisp: -------------------------------------------------------------------------------- 1 | ;;;; stream-fallback.lisp -- loaded when there is no support for gray streams 2 | 3 | (in-package :chipz) 4 | 5 | (defun make-decompressing-stream (format stream) 6 | (declare (ignore format stream)) 7 | (error "make-decompressing-stream is not supported for this lisp implementation")) -------------------------------------------------------------------------------- /stream.lisp: -------------------------------------------------------------------------------- 1 | ;;;; stream.lisp -- gray stream wrappers for INFLATE 2 | 3 | (in-package :chipz) 4 | 5 | (eval-when (:compile-toplevel :load-toplevel) 6 | #-chipz-system:gray-streams 7 | (error "gray streams are not supported in this lisp implementation")) 8 | 9 | ;;; portability definitions 10 | 11 | #+ecl 12 | (eval-when (:compile-toplevel :load-toplevel :execute) 13 | (gray::redefine-cl-functions)) 14 | 15 | #+cmu 16 | (eval-when (:compile-toplevel :load-toplevel :execute) 17 | (require :gray-streams)) 18 | 19 | ;;; TRIVIAL-GRAY-STREAMS has it, we might as well, too... 20 | #+allegro 21 | (eval-when (:compile-toplevel :load-toplevel :execute) 22 | (unless (fboundp 'excl:stream-write-string) 23 | (require "streamc.fasl"))) 24 | 25 | (eval-when (:compile-toplevel :load-toplevel :execute) 26 | (defvar *binary-input-stream-class* 27 | #+lispworks 'stream:fundamental-binary-input-stream 28 | #+sbcl 'sb-gray:fundamental-binary-input-stream 29 | #+openmcl 'gray:fundamental-binary-input-stream 30 | #+cmu 'ext:fundamental-binary-input-stream 31 | #+allegro 'excl:fundamental-binary-input-stream 32 | #+clisp 'gray:fundamental-binary-input-stream 33 | #+ecl 'gray:fundamental-binary-input-stream) 34 | 35 | (defvar *stream-read-byte-function* 36 | #+lispworks 'stream:stream-read-byte 37 | #+sbcl 'sb-gray:stream-read-byte 38 | #+openmcl 'gray:stream-read-byte 39 | #+cmu 'ext:stream-read-byte 40 | #+allegro 'excl:stream-read-byte 41 | #+clisp 'gray:stream-read-byte 42 | #+ecl 'gray:stream-read-byte) 43 | 44 | (defvar *stream-read-sequence-function* 45 | #+lispworks 'stream:stream-read-sequence 46 | #+sbcl 'sb-gray:stream-read-sequence 47 | #+openmcl 'ccl:stream-read-vector 48 | #+cmu 'ext:stream-read-sequence 49 | #+allegro 'excl:stream-read-sequence 50 | #+clisp 'gray:stream-read-byte-sequence 51 | #+ecl 'gray:stream-read-sequence) 52 | ) ; EVAL-WHEN 53 | 54 | ;;; READ-SEQUENCE 55 | 56 | (defmacro define-stream-read-sequence (specializer &body body) 57 | (let ((definition 58 | `(cond 59 | ((not (typep seq 'simple-octet-vector)) 60 | (call-next-method)) 61 | (t 62 | (let ((end (or end (length seq)))) 63 | ,@body))))) 64 | 65 | #+(or cmu sbcl allegro ecl) 66 | `(defmethod #.*stream-read-sequence-function* ((stream ,specializer) seq &optional (start 0) end) 67 | ,definition) 68 | 69 | #+(or lispworks openmcl) 70 | `(defmethod #.*stream-read-sequence-function* ((stream ,specializer) seq start end) 71 | ,definition) 72 | 73 | #+clisp 74 | `(defmethod #.*stream-read-sequence-function* ((stream ,specializer) seq 75 | &optional (start 0) end 76 | ,(gensym "no-hang") 77 | ,(gensym "interactive")) 78 | ,definition))) 79 | 80 | ;;; class definition 81 | 82 | (defclass decompressing-stream (#.*binary-input-stream-class*) 83 | ((wrapped-stream :initarg :stream :reader wrapped-stream) 84 | (dstate :initarg :dstate :reader dstate) 85 | (dfun :initarg :dfun :reader dfun) 86 | (input-buffer :initform (make-array 4096 :element-type '(unsigned-byte 8)) 87 | :reader input-buffer) 88 | (input-buffer-index :initform 0 :accessor input-buffer-index) 89 | (input-buffer-n-bytes :initform 0 :accessor input-buffer-n-bytes) 90 | (output-buffer :initform (make-array 4096 :element-type '(unsigned-byte 8)) 91 | :reader output-buffer) 92 | (output-buffer-index :initform 0 :accessor output-buffer-index) 93 | (output-buffer-n-bytes :initform 0 :accessor output-buffer-n-bytes))) 94 | 95 | ;;; constructors 96 | (defun make-decompressing-stream (format stream) 97 | (multiple-value-bind (state dfun) 98 | (ecase format 99 | ((:deflate :zlib :gzip deflate zlib gzip) 100 | (values (make-inflate-state format) #'%inflate)) 101 | ((:bzip2 bzip2) 102 | (values (make-bzip2-state) #'%bzip2-decompress))) 103 | (make-instance 'decompressing-stream 104 | :stream stream 105 | :dstate state 106 | :dfun dfun))) 107 | 108 | 109 | ;;; stream management 110 | 111 | (defun output-available-p (stream) 112 | (/= (output-buffer-index stream) (output-buffer-n-bytes stream))) 113 | 114 | (defun input-available-p (stream) 115 | (/= (input-buffer-index stream) (input-buffer-n-bytes stream))) 116 | 117 | (defun refill-stream-input-buffer (stream) 118 | (with-slots (input-buffer wrapped-stream 119 | input-buffer-index input-buffer-n-bytes) 120 | stream 121 | (let ((n-bytes-read (read-sequence input-buffer wrapped-stream))) 122 | (setf input-buffer-index 0 input-buffer-n-bytes n-bytes-read) 123 | #+nil 124 | (format *trace-output* "index: ~D | n-bytes ~D~%" 125 | input-buffer-index input-buffer-n-bytes) 126 | (values)))) 127 | 128 | (defun refill-stream-output-buffer (stream) 129 | (unless (input-available-p stream) 130 | (refill-stream-input-buffer stream)) 131 | (multiple-value-bind (bytes-read bytes-output) 132 | (funcall (the function (dfun stream)) 133 | (dstate stream) 134 | (input-buffer stream) 135 | (output-buffer stream) 136 | :input-start (input-buffer-index stream) 137 | :input-end (input-buffer-n-bytes stream)) 138 | (setf (output-buffer-index stream) 0 139 | (output-buffer-n-bytes stream) bytes-output 140 | (input-buffer-index stream) (+ (input-buffer-index stream) bytes-read)) 141 | (assert (<= (input-buffer-index stream) (input-buffer-n-bytes stream))))) 142 | 143 | 144 | ;;; methods 145 | 146 | (defun read-and-decompress-byte (stream) 147 | (flet ((maybe-done () 148 | (when (output-available-p stream) 149 | (return-from read-and-decompress-byte 150 | (aref (output-buffer stream) 151 | (prog1 (output-buffer-index stream) 152 | (incf (output-buffer-index stream)))))))) 153 | ;; several input buffers may be used up before output is available 154 | ;; => read-byte should refill "something" while at all possible, 155 | ;; like read-sequence already does. 156 | (loop initially (maybe-done) 157 | do (refill-stream-output-buffer stream) 158 | (maybe-done) 159 | (unless (input-available-p stream) 160 | (refill-stream-input-buffer stream)) 161 | ;; If we didn't refill, then we must be all done. 162 | (unless (input-available-p stream) 163 | (finish-dstate (dstate stream)) 164 | (return :eof))))) 165 | 166 | (defun copy-existing-output (stream seq start end) 167 | (declare (type simple-octet-vector seq)) 168 | (let ((amount (min (- end start) 169 | (- (output-buffer-n-bytes stream) 170 | (output-buffer-index stream))))) 171 | (replace seq (output-buffer stream) 172 | :start1 start :end1 end 173 | :start2 (output-buffer-index stream) 174 | :end2 (output-buffer-n-bytes stream)) 175 | (incf (output-buffer-index stream) amount) 176 | (+ start amount))) 177 | 178 | (define-stream-read-sequence decompressing-stream 179 | (unless (typep seq 'simple-octet-vector) 180 | (return-from #.*stream-read-sequence-function* (call-next-method))) 181 | (loop initially (when (output-available-p stream) 182 | (setf start (copy-existing-output stream seq 183 | start end))) 184 | while (< start end) 185 | do (unless (input-available-p stream) 186 | (refill-stream-input-buffer stream)) 187 | ;; If we didn't refill, then we must be all done. 188 | (unless (input-available-p stream) 189 | (finish-dstate (dstate stream)) 190 | (loop-finish)) 191 | ;; Decompress directly into the user-provided buffer. 192 | (multiple-value-bind (bytes-read bytes-output) 193 | (funcall (the function (dfun stream)) 194 | (dstate stream) 195 | (input-buffer stream) 196 | seq 197 | :input-start (input-buffer-index stream) 198 | :input-end (input-buffer-n-bytes stream) 199 | :output-start start 200 | :output-end end) 201 | (incf (input-buffer-index stream) bytes-read) 202 | (incf start bytes-output)) 203 | finally (return start))) 204 | 205 | (defmethod #.*stream-read-byte-function* ((stream decompressing-stream)) 206 | (read-and-decompress-byte stream)) 207 | -------------------------------------------------------------------------------- /tests.lisp: -------------------------------------------------------------------------------- 1 | ;; running the tests requires 2 | ;; - ASDF 3 | ;; - the salza2 library (http://www.xach.com/lisp/salza2) 4 | ;; - "bzip2" program (in PATH) 5 | ;; how to run the tests: 6 | ;; - load this file, tests.lisp, into your lisp environment 7 | ;; - create an empty directory, known hereafter as 8 | ;; - put some files into the directory, their extension shall be "uncompressed" 9 | ;; - compress the test files by running (chipz-tests:compress-test-files ) 10 | ;; you only need to do this once 11 | ;; - execute (chipz-tests:run-all-tests ) 12 | 13 | (asdf:oos 'asdf:load-op "chipz") 14 | (asdf:oos 'asdf:load-op "salza2") 15 | 16 | (defpackage :chipz-tests 17 | (:use :cl :chipz) 18 | (:export #:run-all-tests #:compress-test-files)) 19 | 20 | (in-package :chipz-tests) 21 | 22 | (defun test/whole-file (format compressed-pathname original-pathname) 23 | (with-open-file (compressed-stream compressed-pathname :direction :input 24 | :element-type '(unsigned-byte 8)) 25 | (with-open-file (stream original-pathname :direction :input 26 | :element-type '(unsigned-byte 8)) 27 | (let ((compressed-input (make-array (file-length compressed-stream) 28 | :element-type '(unsigned-byte 8))) 29 | (output (make-array (file-length stream) 30 | :element-type '(unsigned-byte 8))) 31 | (original (make-array (file-length stream) 32 | :element-type '(unsigned-byte 8))) 33 | (zstream (make-dstate format))) 34 | (let ((compressed-bytes (read-sequence compressed-input compressed-stream))) 35 | (read-sequence original stream) 36 | (multiple-value-bind (bytes-read bytes-output) 37 | (decompress output zstream compressed-input :input-end compressed-bytes) 38 | (and (= bytes-read compressed-bytes) 39 | (= bytes-output (length original)) 40 | (not (mismatch output original :end1 bytes-output 41 | :end2 bytes-output))))))))) 42 | 43 | (defun test/whole-file-cons (format compressed-pathname original-pathname) 44 | (with-open-file (compressed-stream compressed-pathname :direction :input 45 | :element-type '(unsigned-byte 8)) 46 | (with-open-file (stream original-pathname :direction :input 47 | :element-type '(unsigned-byte 8)) 48 | (let ((compressed-input (make-array (file-length compressed-stream) 49 | :element-type '(unsigned-byte 8))) 50 | (original (make-array (file-length stream) 51 | :element-type '(unsigned-byte 8)))) 52 | (read-sequence compressed-input compressed-stream) 53 | (let ((output (decompress nil format compressed-input))) 54 | (read-sequence original stream) 55 | (and (= (length original) (length output)) 56 | (not (mismatch output original)))))))) 57 | 58 | (defun test/incremental-file (format compressed-pathname original-pathname) 59 | (with-open-file (compressed-stream compressed-pathname :direction :input 60 | :element-type '(unsigned-byte 8)) 61 | (with-open-file (stream original-pathname :direction :input 62 | :element-type '(unsigned-byte 8)) 63 | (let ((compressed-input (make-array (file-length compressed-stream) 64 | :element-type '(unsigned-byte 8))) 65 | (output (make-array (file-length stream) 66 | :element-type '(unsigned-byte 8))) 67 | (original (make-array (file-length stream) 68 | :element-type '(unsigned-byte 8))) 69 | (zstream (make-dstate format))) 70 | (read-sequence original stream) 71 | (let ((compressed-bytes (read-sequence compressed-input compressed-stream)) 72 | (input-index 0) 73 | (output-index 0)) 74 | (loop 75 | (multiple-value-bind (bytes-read bytes-output) 76 | (decompress output zstream compressed-input 77 | :input-start input-index 78 | :input-end compressed-bytes 79 | :output-start output-index 80 | :output-end (1+ output-index)) 81 | (when (zerop bytes-output) (return)) 82 | (let ((ouch (mismatch original output 83 | :start1 output-index :start2 output-index 84 | :end1 (1+ output-index) :end2 (1+ output-index)))) 85 | (when ouch 86 | (return nil))) 87 | (incf input-index bytes-read) 88 | (incf output-index))) 89 | (and (= input-index compressed-bytes)) 90 | (= output-index (length original)) 91 | (not (mismatch output original :end1 output-index 92 | :end2 output-index))))))) 93 | 94 | #+chipz-system:gray-streams 95 | (defun test/gray-stream-read-sequence (format compressed-pathname original-pathname) 96 | (with-open-file (compressed-stream compressed-pathname :direction :input 97 | :element-type '(unsigned-byte 8)) 98 | (with-open-file (stream original-pathname :direction :input 99 | :element-type '(unsigned-byte 8)) 100 | (let ((zstream (make-decompressing-stream format compressed-stream)) 101 | (output (make-array (file-length stream) 102 | :element-type '(unsigned-byte 8))) 103 | (original (make-array (file-length stream) 104 | :element-type '(unsigned-byte 8)))) 105 | (read-sequence output zstream) 106 | (read-sequence original stream) 107 | (not (mismatch output original)))))) 108 | 109 | #+chipz-system:gray-streams 110 | (defun test/gray-stream-read-byte (format compressed-pathname original-pathname) 111 | (with-open-file (compressed-stream compressed-pathname :direction :input 112 | :element-type '(unsigned-byte 8)) 113 | (with-open-file (stream original-pathname :direction :input 114 | :element-type '(unsigned-byte 8)) 115 | (let ((zstream (make-decompressing-stream format compressed-stream)) 116 | (output (make-array (file-length stream) 117 | :element-type '(unsigned-byte 8))) 118 | (original (make-array (file-length stream) 119 | :element-type '(unsigned-byte 8)))) 120 | (loop for i from 0 below (file-length stream) do 121 | (progn 122 | (setf (aref output i) (read-byte zstream)) 123 | (setf (aref original i) (read-byte stream)))) 124 | (not (mismatch output original)))))) 125 | 126 | (defparameter *default-test-files-dir* 127 | (make-pathname 128 | :directory (append (pathname-directory *LOAD-TRUENAME*) '("test-files")) 129 | :device (pathname-device *LOAD-TRUENAME*) 130 | :host (pathname-host *LOAD-TRUENAME*))) 131 | 132 | (defparameter *test-functions* 133 | (list 'test/whole-file 134 | 'test/whole-file-cons 135 | 'test/incremental-file 136 | #+chipz-system:gray-streams 'test/gray-stream-read-sequence 137 | #+chipz-system:gray-streams 'test/gray-stream-read-byte)) 138 | 139 | (defparameter *formats* 140 | '(gzip zlib deflate bzip2)) 141 | 142 | (defmacro dolist/every ((var list-form) &body body) 143 | (let ((all-ok (gensym))) 144 | `(reduce 145 | (lambda (,all-ok ,var) (and (progn ,@body) ,all-ok)) 146 | ,list-form :initial-value t))) 147 | 148 | (defun run-all-tests (&optional (test-files-dir *default-test-files-dir*)) 149 | (labels ((run-test (testfun format uncompressed-file) 150 | (let ((compressed (make-pathname :type (symbol-name format) 151 | :defaults uncompressed-file))) 152 | (format t "; ~A ~A~%" (symbol-name testfun) compressed) 153 | (with-simple-restart (skip-test "skip ~A ~A" (symbol-name testfun) compressed) 154 | (assert (probe-file compressed)) 155 | (let* ((begin (get-internal-run-time)) 156 | (result (funcall testfun format compressed uncompressed-file)) 157 | (end (get-internal-run-time)) 158 | (secs (/ (- end begin) internal-time-units-per-second))) 159 | (if result 160 | (format t "; PASSED (~4$ seconds)~%" secs) 161 | (format t "; FAILED (~4$ seconds) ~A~%" secs result)) 162 | result))))) 163 | (let* ((uncompressed (make-pathname :name :wild :type "uncompressed" 164 | :defaults test-files-dir))) 165 | (dolist/every (testfun *test-functions*) 166 | (dolist/every (format *formats*) 167 | (dolist/every (file (directory uncompressed)) 168 | (run-test testfun format file))))))) 169 | 170 | (defun run-salza2 (compressor-class input-file output-file) 171 | (with-open-file (in-stream input-file :element-type '(unsigned-byte 8)) 172 | (with-open-file (out-stream output-file :element-type '(unsigned-byte 8) 173 | :direction :output 174 | :if-exists :supersede) 175 | (let ((buffer (make-array 100000 :element-type '(unsigned-byte 8))) 176 | (callback (salza2:make-stream-output-callback out-stream))) 177 | (salza2:with-compressor (comp compressor-class :callback callback) 178 | (loop 179 | (let ((bytes-read (read-sequence buffer in-stream))) 180 | (if (zerop bytes-read) 181 | (return) 182 | (salza2:compress-octet-vector buffer comp :end bytes-read))))))))) 183 | 184 | (defun run-external (output-file executable &rest args) 185 | #+lispworks 186 | (system:run-shell-command ;; cmd argv[0] argv[1..] 187 | (map 'vector #'identity (list* executable executable args)) 188 | :output output-file :if-output-exists :supersede) 189 | #+sbcl 190 | (sb-ext:run-program 191 | executable args :search t :output output-file :if-output-exists :supersede) 192 | #+openmcl 193 | (ccl:run-program 194 | executable args :output output-file :if-output-exists :supersede) 195 | #+cmu 196 | (extensions:run-program 197 | executable args :output output-file :if-output-exists :supersede) 198 | #+clisp 199 | (ext:run-program 200 | executable :arguments args :output output-file :if-output-exists :overwrite) 201 | #+ecl 202 | (ext:run-program 203 | executable args :output output-file :if-output-exists :supersede) 204 | #-(or lispworks sbcl openmcl cmu clisp ecl) 205 | (error "run-external is not supported for this lisp implementation")) 206 | 207 | (defun compress-test-files (&optional (test-files-dir *default-test-files-dir*)) 208 | (let ((uncompressed (make-pathname :name :wild :type "uncompressed" 209 | :defaults test-files-dir))) 210 | (dolist (input (directory uncompressed)) 211 | (format t "; compressing ~A~%" input) 212 | (dolist (format *formats*) 213 | (let ((output (make-pathname :type (symbol-name format) :defaults input))) 214 | (ecase format 215 | (deflate (run-salza2 'salza2:deflate-compressor input output)) 216 | (zlib (run-salza2 'salza2:zlib-compressor input output)) 217 | (gzip (run-salza2 'salza2:gzip-compressor input output)) 218 | (bzip2 (run-external output "bzip2" "-c" (namestring input))))))))) 219 | -------------------------------------------------------------------------------- /types-and-tables.lisp: -------------------------------------------------------------------------------- 1 | (in-package :chipz) 2 | 3 | (deftype index () '(mod #.array-dimension-limit)) 4 | 5 | (deftype simple-octet-vector (&optional length) 6 | (let ((length (or length '*))) 7 | `(simple-array (unsigned-byte 8) (,length)))) 8 | 9 | (deftype deflate-code-length () '(integer 0 #.+max-code-length+)) 10 | (deftype deflate-code () '(unsigned-byte #.+max-code-length+)) 11 | (deftype deflate-code-value () '(integer 0 (#.+max-codes+))) 12 | 13 | (defparameter *distance-code-extra-bits* 14 | ;; codes 30 and 31 will never actually appear, but we represent them 15 | ;; for completeness' sake 16 | #(0 0 0 0 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10 11 11 12 12 13 13 0 0)) 17 | (defparameter *distance-code-base-distances* 18 | #(1 2 3 4 5 7 9 13 17 25 33 49 65 97 129 193 257 385 513 769 19 | 1025 1537 2049 3073 4097 6145 8193 12289 16385 24577)) 20 | 21 | (declaim (inline n-length-extra-bits n-distance-extra-bits length-base distance-base)) 22 | (defun n-length-extra-bits (value) 23 | (aref +length-code-extra-bits+ value)) 24 | 25 | (defun n-distance-extra-bits (distance-code) 26 | (svref *distance-code-extra-bits* distance-code)) 27 | 28 | (defun length-base (value) 29 | (aref +length-code-base-lengths+ value)) 30 | 31 | (defun distance-base (distance-code) 32 | (svref *distance-code-base-distances* distance-code)) 33 | 34 | (defparameter *code-length-code-order* 35 | #(16 17 18 0 8 7 9 6 10 5 11 4 12 3 13 2 14 1 15)) 36 | 37 | (eval-when (:compile-toplevel :load-toplevel :execute) 38 | (defstruct (code-range-descriptor 39 | (:conc-name code-) 40 | (:constructor make-crd (n-bits start-value end-value))) 41 | (n-bits 0 :type deflate-code-length) 42 | (start-value 0 :type deflate-code-value) 43 | (end-value 0 :type deflate-code-value)) 44 | 45 | (defstruct (huffman-decode-table 46 | (:conc-name hdt-) 47 | (:constructor make-hdt (counts offsets symbols bits))) 48 | ;; FIXME: look into combining these two into one array for speed. 49 | (counts #1=(error "required parameter") 50 | :type (simple-array (unsigned-byte 16) (#.+max-code-length+)) 51 | :read-only t) 52 | (offsets #1# :type (simple-array (unsigned-byte 16) (#.(1+ +max-code-length+))) 53 | :read-only t) 54 | (symbols nil :read-only t :type (simple-array fixnum (*))) 55 | (bits nil :read-only t)) 56 | ) ; EVAL-WHEN 57 | 58 | 59 | ;;; decode table construction 60 | 61 | (defun construct-huffman-decode-table (code-lengths &optional n-syms) 62 | (let* ((n-syms (or n-syms (length code-lengths))) 63 | (min-code-length +max-code-length+) 64 | (max-code-length 0) 65 | (counts (make-array +max-code-length+ :initial-element 0 66 | :element-type '(unsigned-byte 16))) 67 | (offsets (make-array (1+ +max-code-length+) :initial-element 0 68 | :element-type '(unsigned-byte 16))) 69 | (symbols (make-array n-syms :initial-element 0 :element-type 'fixnum))) 70 | (declare (type (simple-array (unsigned-byte 16) (*)) counts) 71 | (type (simple-array fixnum (*)) symbols)) 72 | (dotimes (i n-syms) 73 | (let ((c (aref code-lengths i))) 74 | (setf min-code-length (min min-code-length c)) 75 | (setf max-code-length (max max-code-length c)) 76 | (incf (aref counts c)))) 77 | ;; generate offsets 78 | (loop for i from 1 below +deflate-max-bits+ 79 | do (setf (aref offsets (1+ i)) (+ (aref offsets i) (aref counts i)))) 80 | (dotimes (i n-syms (make-hdt counts offsets symbols max-code-length)) 81 | (let ((l (aref code-lengths i))) 82 | (unless (zerop l) 83 | (setf (aref symbols (aref offsets l)) i) 84 | (incf (aref offsets l))))))) 85 | 86 | 87 | ;;; decoders for fixed compression blocks 88 | 89 | (defparameter *fixed-block-code-lengths* 90 | (map 'list #'make-crd 91 | '(8 9 7 8) ; lengths 92 | '(0 144 256 280) ; start values 93 | '(143 255 279 287))) ; end values 94 | 95 | (defparameter *fixed-block-distance-lengths* 96 | (list (make-crd 5 0 29))) 97 | 98 | (defun code-n-values (c) 99 | (1+ (- (code-end-value c) (code-start-value c)))) 100 | 101 | (defun compute-huffman-decode-structure (code-descriptors) 102 | (let* ((n-syms (loop for cd in code-descriptors 103 | sum (code-n-values cd))) 104 | (code-lengths (make-array n-syms :element-type '(unsigned-byte 16)))) 105 | (dolist (cd code-descriptors) 106 | (fill code-lengths (code-n-bits cd) 107 | :start (code-start-value cd) :end (1+ (code-end-value cd)))) 108 | (construct-huffman-decode-table code-lengths))) 109 | 110 | (defparameter *fixed-literal/length-table* 111 | (compute-huffman-decode-structure *fixed-block-code-lengths*)) 112 | (defparameter *fixed-distance-table* 113 | (compute-huffman-decode-structure *fixed-block-distance-lengths*)) 114 | 115 | (defmacro probably-the-fixnum (form) 116 | #+sbcl 117 | `(sb-ext:truly-the fixnum ,form) 118 | #-sbcl 119 | form) 120 | 121 | ;;; I want to make this work, but it drastically slows the code down in 122 | ;;; sbcl. Part of this is due to bad code generation (jump to jump to 123 | ;;; jump, yuck). 124 | #+nil 125 | (defun decode-value (table state) 126 | (declare (type huffman-decode-table table)) 127 | (declare (type inflate-state state)) 128 | (declare (optimize (speed 3))) 129 | (do ((bits (inflate-state-bits state)) 130 | (n-bits (inflate-state-n-bits state)) 131 | (counts (hdt-counts table)) 132 | (len 1) 133 | (first 0) 134 | (code 0)) 135 | (nil nil) 136 | (declare (type (unsigned-byte 32) bits)) 137 | (declare (type (integer 0 32) n-bits)) 138 | (declare (type (and fixnum (integer 0 *)) first code)) 139 | (do () 140 | ((zerop n-bits) 141 | (when (= (inflate-state-input-index state) 142 | (inflate-state-input-end state)) 143 | (throw 'inflate-done nil)) 144 | (setf bits (aref (inflate-state-input state) 145 | (inflate-state-input-index state))) 146 | (setf (inflate-state-input-index state) 147 | (sb-ext:truly-the fixnum (1+ (inflate-state-input-index state)))) 148 | (setf n-bits 8)) 149 | ;; We would normally do this with READ-BITS, but DECODE-VALUE 150 | ;; is a hotspot in profiles along with this would-be call to 151 | ;; READ-BITS, so we inline it all here. 152 | (setf code (logior code (logand bits 1)) 153 | bits (ash bits -1)) 154 | (decf n-bits) 155 | (let ((count (aref counts len))) 156 | (when (< (- code count) first) 157 | (setf (inflate-state-bits state) bits) 158 | (setf (inflate-state-n-bits state) n-bits) 159 | (return-from decode-value (aref (hdt-symbols table) 160 | #+sbcl 161 | (sb-ext:truly-the fixnum 162 | #3=(+ (aref (hdt-offsets table) (1- len)) 163 | (- code first))) 164 | #-sbcl #3#))) 165 | (setf first 166 | #+sbcl (sb-ext:truly-the fixnum (+ first count)) 167 | #-sbcl (+ first count) 168 | first 169 | #+sbcl (sb-ext:truly-the fixnum #1=(ash first 1)) 170 | #-sbcl #1# 171 | code 172 | #+sbcl (sb-ext:truly-the fixnum #2=(ash code 1)) 173 | #-sbcl #2# 174 | len (1+ len)))))) 175 | -------------------------------------------------------------------------------- /zlib.lisp: -------------------------------------------------------------------------------- 1 | ;;;; zlib.lisp -- dealing with zlib-wrapped deflate data 2 | 3 | (in-package :chipz) 4 | 5 | (defclass zlib-header () 6 | ((flags :initarg :flags :accessor flags) 7 | (cmf :initarg :cmf :accessor cmf) 8 | (fdict :initarg :fdict :accessor fdict) 9 | (adler32 :initarg :adler32 :accessor adler32))) 10 | 11 | (defconstant +zlib-compression-method+ 8) 12 | 13 | (defun zlib-compression-method (cmf-byte) 14 | (declare (type (unsigned-byte 8) cmf-byte)) 15 | (ldb (byte 4 0) cmf-byte)) 16 | 17 | (defun zlib-compression-info (cmf-byte) 18 | (declare (type (unsigned-byte 8) cmf-byte)) 19 | (ldb (byte 4 4) cmf-byte)) 20 | 21 | (defconstant +zlib-flag-fdict+ 5) 22 | 23 | (defun zlib-flag-fcheck (flag-byte) 24 | (declare (type (unsigned-byte 8) flag-byte)) 25 | (ldb (byte 4 0) flag-byte)) 26 | 27 | (defconstant +zlib-flevel-fastest+ 0) 28 | (defconstant +zlib-flevel-fast+ 1) 29 | (defconstant +zlib-flevel-default+ 2) 30 | (defconstant +zlib-flevel-maximum+ 3) 31 | 32 | (defun zlib-flag-flevel (flag-byte) 33 | (declare (type (unsigned-byte 8) flag-byte)) 34 | (ldb (byte 2 6) flag-byte)) 35 | --------------------------------------------------------------------------------