├── .gitignore ├── .travis.yml ├── LICENSE.txt ├── README.md ├── cl-hamt-examples.asd ├── cl-hamt-test.asd ├── cl-hamt.asd ├── examples ├── cl-hamt.ipynb ├── lexicon.lisp └── package.lisp ├── src ├── README.md ├── hamt.lisp ├── hash-dict.lisp ├── hash-set.lisp ├── package.lisp └── util.lisp └── test ├── benchmarks.lisp ├── hash-dict-test.lisp ├── hash-set-test.lisp └── package.lisp /.gitignore: -------------------------------------------------------------------------------- 1 | *.FASL 2 | *.fasl 3 | *.lisp-temp 4 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | language: common-lisp 2 | sudo: required 3 | 4 | env: 5 | matrix: 6 | - LISP=sbcl 7 | - LISP=ccl 8 | 9 | install: 10 | - curl https://raw.githubusercontent.com/luismbo/cl-travis/master/install.sh | bash 11 | - git clone https://github.com/ruricolist/cl-murmurhash.git ~/lisp/cl-murmurhash/ 12 | 13 | script: 14 | - cl -l fiveam 15 | -e '(setf fiveam:*debug-on-error* t 16 | fiveam:*debug-on-failure* t)' 17 | -e '(setf *debugger-hook* 18 | (lambda (c h) 19 | (declare (ignore c h)) 20 | (uiop:quit -1)))' 21 | -e '(ql:quickload :cl-hamt-test)' 22 | 23 | notifications: 24 | email: 25 | - shapero.daniel@gmail.com -------------------------------------------------------------------------------- /LICENSE.txt: -------------------------------------------------------------------------------- 1 | Copyright 2015-2017 Daniel Shapero 2 | 3 | Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 4 | 5 | 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 6 | 7 | 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. 8 | 9 | 3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. 10 | 11 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 12 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | cl-hamt 2 | ======= 3 | 4 | [![Build Status](https://travis-ci.org/danshapero/cl-hamt.svg?branch=master)](https://travis-ci.org/danshapero/cl-hamt) 5 | 6 | This library provides purely functional dictionaries and sets in Common Lisp based on the hash array-mapped trie data structure. 7 | The operations provided are: 8 | ``` 9 | size 10 | lookup 11 | insert 12 | remove 13 | reduce 14 | filter 15 | map 16 | eq 17 | ``` 18 | The versions for sets and dictionaries are obtained by prepending `set-` or `dict-` to the above symbols, so for example to lookup a key in a set and dictionary you would call `set-lookup` and `dict-lookup` respectively. 19 | An empty collection is created with the functions `empty-set` and `empty-dict`. 20 | 21 | See the `examples/` directory for some usage examples of the library, or the unit tests. 22 | Some benchmark code can be found in `tests/benchmarks.lisp`. 23 | 24 | 25 | Implementation 26 | ============== 27 | 28 | The data types in cl-hamt are implemented using the [hash array-mapped trie](https://idea.popcount.org/2012-07-25-introduction-to-hamt/) (HAMT) data structure, as found in the Clojure language. 29 | HAMTs provide near-constant time search, insertion and removal and can be used as persistent data structures, i.e. all updates are non-destructive. 30 | 31 | As the name suggests, hash array-mapped tries use hashing of the underlying data to store and retrieve it efficiently. 32 | Consequently, any natural ordering on the data, e.g. lexicographic ordering of strings, natural ordering of integers, is not preserved in a HAMT. 33 | When using `reduce` on a collection, the operation in question must not depend on the order in which the elements are accessed. 34 | If the data are ordered and this ordering is important, a self-balancing binary tree may be a more appropriate data structure. 35 | Additionally, one must provide an appropriate 32-bit hash function. 36 | We default to using `murmurhash`, as implemented in the Common Lisp package `cl-murmurhash`. 37 | Note that the built-in Common Lisp hash function `sxhash` is not a 32-bit hash; for example, on my 64-bit system with SBCL, it returns a 62-bit hash. 38 | 39 | While most operations on HAMTs have a complexity of log base-32 in the size of the data structure, there is quite a bit of overhead. 40 | HAMTs are probably less efficient for repeated operations on small-size sets and dictionaries than, say, a list or an association list. 41 | -------------------------------------------------------------------------------- /cl-hamt-examples.asd: -------------------------------------------------------------------------------- 1 | 2 | (asdf:defsystem #:cl-hamt-examples 3 | :description "Example code for using cl-hamt" 4 | :author "Daniel Shapero " 5 | :license "BSD 3-clause" 6 | :depends-on (#:cl-hamt 7 | #:cl-ppcre 8 | #:drakma) 9 | :serial t 10 | :components 11 | ((:module "examples" 12 | :serial t 13 | :components 14 | ((:file "package") 15 | (:file "lexicon"))))) 16 | -------------------------------------------------------------------------------- /cl-hamt-test.asd: -------------------------------------------------------------------------------- 1 | 2 | (asdf:defsystem #:cl-hamt-test 3 | :description "Unit tests and benchmarks for cl-hamt" 4 | :author "Daniel Shapero " 5 | :license "BSD 3-clause" 6 | :depends-on (#:cl-hamt 7 | #:fiveam) 8 | :serial t 9 | :components 10 | ((:module "test" 11 | :serial t 12 | :components 13 | ((:file "package") 14 | (:file "hash-dict-test") 15 | (:file "hash-set-test") 16 | (:file "benchmarks"))))) 17 | 18 | -------------------------------------------------------------------------------- /cl-hamt.asd: -------------------------------------------------------------------------------- 1 | 2 | (asdf:defsystem #:cl-hamt 3 | :description "Dictionary & set data structure using hash array-mapped tries" 4 | :author "Daniel Shapero " 5 | :license "BSD 3-clause" 6 | :depends-on (#:cl-murmurhash) 7 | :serial t 8 | :components 9 | ((:module "src" 10 | :serial t 11 | :components 12 | ((:file "package") 13 | (:file "util") 14 | (:file "hamt") 15 | (:file "hash-dict") 16 | (:file "hash-set"))))) 17 | -------------------------------------------------------------------------------- /examples/cl-hamt.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "(ql:quickload :cl-hamt)" 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": null, 15 | "metadata": {}, 16 | "outputs": [], 17 | "source": [ 18 | "(use-package :cl-hamt)" 19 | ] 20 | }, 21 | { 22 | "cell_type": "markdown", 23 | "metadata": {}, 24 | "source": [ 25 | "# cl-hamt\n", 26 | "\n", 27 | "This notebook demonstrates how to use the Common Lisp library `cl-hamt`.\n", 28 | "`cl-hamt` provides data structures for sets and dictionaries based on [hash array-mapped tries](https://en.wikipedia.org/wiki/Hash_array_mapped_trie) (HAMTs).\n", 29 | "This data type provides insertion, deletion, and lookup of entries in a collection of size $n$ in $\\log_{32}(n)$ time.\n", 30 | "The space usage is asymptotic to $n\\cdot\\log_{32}(n)$ + some constant overhead.\n", 31 | "For large collections, this overhead is a small fraction of the whole, but if you're working with a large number of small-size collections, a list or an association list might be more appropriate.\n", 32 | "The dictionary data type in the [Clojure](https://en.wikipedia.org/wiki/Clojure) programming language's standard library is implemented using HAMTs.\n", 33 | "\n", 34 | "The implementation of HAMTs in this library is fully persistent.\n", 35 | "A persistent collection is never truly modified; rather, when one wishes to add an entry to a collection, a new collection is returned which contains the new element and the old, unmodified collection is preserved.\n", 36 | "The new augmented collection, however, shares as much structure as possible with the old collection.\n", 37 | "The garbage collector cleans up any old versions of data structures for us if we're not using them anymore.\n", 38 | "All told, these persistent collections don't use much more memory than their imperative counterparts.\n", 39 | "Since persistent collections are fundamentally immutable and one never makes destructive updates, they can be much easier to reason about and debug." 40 | ] 41 | }, 42 | { 43 | "cell_type": "markdown", 44 | "metadata": {}, 45 | "source": [ 46 | "### Hash sets\n", 47 | "\n", 48 | "Some basic usage -- adding, removing entries -- of the set API is shown below.\n", 49 | "The function `empty-set` creates a set with no entries in it; to populate it, call the function `set-insert`." 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "execution_count": null, 55 | "metadata": {}, 56 | "outputs": [], 57 | "source": [ 58 | "(empty-set)" 59 | ] 60 | }, 61 | { 62 | "cell_type": "code", 63 | "execution_count": null, 64 | "metadata": {}, 65 | "outputs": [], 66 | "source": [ 67 | "(defvar musicians\n", 68 | " (set-insert (empty-set)\n", 69 | " \"Miles Davis\"\n", 70 | " \"John Coltrane\"\n", 71 | " \"Charlie Parker\"\n", 72 | " \"Dizzy Gillespie\"\n", 73 | " \"Mary Lou Williams\"\n", 74 | " \"Ella Fitzgerald\"\n", 75 | " \"Nina Simone\"\n", 76 | " \"Jascha Heifetz\"\n", 77 | " \"Hilary Hahn\"\n", 78 | " \"Stefan Grappelli\"\n", 79 | " \"Chubby Wise\"\n", 80 | " \"Lester Flatt\"\n", 81 | " \"Earl Scruggs\"\n", 82 | " \"Bill Monroe\"))" 83 | ] 84 | }, 85 | { 86 | "cell_type": "markdown", 87 | "metadata": {}, 88 | "source": [ 89 | "There are two query operations we can perform on sets: getting the size of the set, and returning whether or not some object is contained in the set." 90 | ] 91 | }, 92 | { 93 | "cell_type": "code", 94 | "execution_count": null, 95 | "metadata": {}, 96 | "outputs": [], 97 | "source": [ 98 | "(set-size musicians)" 99 | ] 100 | }, 101 | { 102 | "cell_type": "code", 103 | "execution_count": null, 104 | "metadata": {}, 105 | "outputs": [], 106 | "source": [ 107 | "(set-lookup musicians \"Nina Simone\")" 108 | ] 109 | }, 110 | { 111 | "cell_type": "code", 112 | "execution_count": null, 113 | "metadata": {}, 114 | "outputs": [], 115 | "source": [ 116 | "(set-lookup musicians \"Teo Macero\")" 117 | ] 118 | }, 119 | { 120 | "cell_type": "markdown", 121 | "metadata": {}, 122 | "source": [ 123 | "Adding an entry to a set creates a new set; it does not modify the old one." 124 | ] 125 | }, 126 | { 127 | "cell_type": "code", 128 | "execution_count": null, 129 | "metadata": {}, 130 | "outputs": [], 131 | "source": [ 132 | "(defvar even-more-musicians (set-insert musicians \"Birgit Nilsson\"))" 133 | ] 134 | }, 135 | { 136 | "cell_type": "code", 137 | "execution_count": null, 138 | "metadata": {}, 139 | "outputs": [], 140 | "source": [ 141 | "(set-lookup musicians \"Birgit Nilsson\")" 142 | ] 143 | }, 144 | { 145 | "cell_type": "code", 146 | "execution_count": null, 147 | "metadata": {}, 148 | "outputs": [], 149 | "source": [ 150 | "(set-lookup even-more-musicians \"Birgit Nilsson\")" 151 | ] 152 | }, 153 | { 154 | "cell_type": "markdown", 155 | "metadata": {}, 156 | "source": [ 157 | "Reducing over a collection is the key means of performing an operation on all of its elements.\n", 158 | "The code below starts with an empty string, then concatenates the name of every musician in the set with a newline after." 159 | ] 160 | }, 161 | { 162 | "cell_type": "code", 163 | "execution_count": null, 164 | "metadata": {}, 165 | "outputs": [], 166 | "source": [ 167 | "(set-reduce (lambda (name str)\n", 168 | " (concatenate 'string str (string #\\linefeed) name))\n", 169 | " musicians\n", 170 | " \"\")" 171 | ] 172 | }, 173 | { 174 | "cell_type": "markdown", 175 | "metadata": {}, 176 | "source": [ 177 | "Note that HAMTs store the input by hashing it.\n", 178 | "Hashing does not preserve any natural ordering (e.g. lexicographic) of the entries.\n", 179 | "Nor does it preserve the order in which they were inserted." 180 | ] 181 | }, 182 | { 183 | "cell_type": "markdown", 184 | "metadata": {}, 185 | "source": [ 186 | "### Hash maps\n", 187 | "\n", 188 | "Dictionaries work just like sets do and the names of all the API functions are the same.\n", 189 | "The main difference is that, when inserting entries into a dictionary, the arguments come in alternating key-value pairs." 190 | ] 191 | }, 192 | { 193 | "cell_type": "code", 194 | "execution_count": null, 195 | "metadata": {}, 196 | "outputs": [], 197 | "source": [ 198 | "(defvar musicians->instruments\n", 199 | " (dict-insert (empty-dict)\n", 200 | " \"Miles Davis\" \"trumpet\"\n", 201 | " \"John Coltrane\" \"tenor sax\"\n", 202 | " \"Charlie Parker\" \"alto sax\"\n", 203 | " \"Dizzy Gillespie\" \"trumpet\"\n", 204 | " \"Mary Lou Williams\" \"piano\"\n", 205 | " \"Ella Fitzgerald\" \"voice\"\n", 206 | " \"Nina Simone\" \"voice\"\n", 207 | " \"Jascha Heifetz\" \"violin\"\n", 208 | " \"Hilary Hahn\" \"violin\"\n", 209 | " \"Stefan Grappelli\" \"violin\"\n", 210 | " \"Chubby Wise\" \"violin\"\n", 211 | " \"Lester Flatt\" \"guitar\"\n", 212 | " \"Earl Scruggs\" \"banjo\"\n", 213 | " \"Bill Monroe\" \"mandolin\"))" 214 | ] 215 | }, 216 | { 217 | "cell_type": "markdown", 218 | "metadata": {}, 219 | "source": [ 220 | "Rather than returning merely true or false when looking up whether an entry is in a dictionary, the function `dict-lookup` returns the value that this key is mapped to." 221 | ] 222 | }, 223 | { 224 | "cell_type": "code", 225 | "execution_count": null, 226 | "metadata": {}, 227 | "outputs": [], 228 | "source": [ 229 | "(dict-lookup musicians->instruments \"Dizzy Gillespie\")" 230 | ] 231 | }, 232 | { 233 | "cell_type": "markdown", 234 | "metadata": {}, 235 | "source": [ 236 | "Looking up a key that is not in the dictionary will return nil.\n", 237 | "Consequently if you were to store a key that actually is mapped to nil, this would be indistinguishable from the key not being in the dictionary at all." 238 | ] 239 | }, 240 | { 241 | "cell_type": "code", 242 | "execution_count": null, 243 | "metadata": {}, 244 | "outputs": [], 245 | "source": [ 246 | "(dict-lookup musicians->instruments \"Karl Rove\")" 247 | ] 248 | }, 249 | { 250 | "cell_type": "markdown", 251 | "metadata": {}, 252 | "source": [ 253 | "The same operations (filter and reduce) are defined for dictionaries as for sets, but the signature of the filtering or reducing function you pass to it takes in both the key and the value.\n", 254 | "The code below iterates over the dictionary and creates a list of the names of all the violinists." 255 | ] 256 | }, 257 | { 258 | "cell_type": "code", 259 | "execution_count": null, 260 | "metadata": {}, 261 | "outputs": [], 262 | "source": [ 263 | "(dict-reduce (lambda (violinists name instrument)\n", 264 | " (if (equal instrument \"violin\")\n", 265 | " (cons name violinists)\n", 266 | " violinists))\n", 267 | " musicians->instruments\n", 268 | " '())" 269 | ] 270 | }, 271 | { 272 | "cell_type": "markdown", 273 | "metadata": {}, 274 | "source": [ 275 | "For dictionaries, there are specialized versions of reduce for operating only on the dictionary keys or values.\n", 276 | "The function below counts the number of violinists rather than assembling all of them into a list." 277 | ] 278 | }, 279 | { 280 | "cell_type": "code", 281 | "execution_count": null, 282 | "metadata": {}, 283 | "outputs": [], 284 | "source": [ 285 | "(dict-reduce-values (lambda (count instrument)\n", 286 | " (+ count (if (equal instrument \"violin\") 1 0)))\n", 287 | " musicians->instruments\n", 288 | " 0)" 289 | ] 290 | } 291 | ], 292 | "metadata": { 293 | "kernelspec": { 294 | "display_name": "SBCL Lisp", 295 | "language": "lisp", 296 | "name": "lisp" 297 | }, 298 | "language_info": { 299 | "codemirror_mode": "text/x-common-lisp", 300 | "mimetype": "text/x-common-lisp", 301 | "name": "common-lisp", 302 | "pygments_lexer": "common-lisp", 303 | "version": "X3J13" 304 | } 305 | }, 306 | "nbformat": 4, 307 | "nbformat_minor": 1 308 | } 309 | -------------------------------------------------------------------------------- /examples/lexicon.lisp: -------------------------------------------------------------------------------- 1 | 2 | (in-package #:cl-hamt-examples) 3 | 4 | (defvar ml-databases-uri 5 | "https://archive.ics.uci.edu/ml/machine-learning-databases/") 6 | 7 | (defvar bag-of-words-uri 8 | (concatenate 'string ml-databases-uri "bag-of-words/")) 9 | 10 | (defun fetch-corpus (corpus) 11 | "Given the name of a corpus, fetch the corresponding bag of words 12 | from the online database." 13 | (let* ((uri (concatenate 'string 14 | bag-of-words-uri "vocab." corpus ".txt")) 15 | (text (drakma:http-request uri))) 16 | (reduce #'set-insert 17 | (cl-ppcre:all-matches-as-strings "\\w+" text) 18 | :initial-value (empty-set)))) 19 | 20 | (defun fetch-corpora () 21 | "Fetch all the word bags from UCI's database." 22 | (reduce (lambda (dict corpus) 23 | (dict-insert dict corpus (fetch-corpus corpus))) 24 | '("enron" "kos" "nips" "nytimes" "pubmed") 25 | :initial-value (empty-dict))) 26 | 27 | (defun common-words () 28 | "Return a set of words that are found in every corpus" 29 | (let ((corpora (fetch-corpora))) 30 | (dict-reduce-values #'set-intersection 31 | corpora 32 | (dict-lookup corpora "kos")))) 33 | -------------------------------------------------------------------------------- /examples/package.lisp: -------------------------------------------------------------------------------- 1 | 2 | (defpackage #:cl-hamt-examples 3 | (:use #:cl #:cl-hamt) 4 | (:export #:fetch-corpora 5 | #:common-words)) 6 | -------------------------------------------------------------------------------- /src/README.md: -------------------------------------------------------------------------------- 1 | 2 | ## A word on the code organization 3 | 4 | The `hash-dict` and `hash-set` data structures that the user will see are wrappers around classes for the trie's nodes. 5 | There are three types of trie nodes: leaves, which store the actual data; table nodes, which store a bitmap and array to index the key hashes; and conflict nodes, which are like leaves in the event of a hash collision. 6 | These are defined in `hamt.lisp`. 7 | The call signature for operations on these classes is different from that of the wrapper types because the hash and tree depth must be passed explicitly. 8 | Operations on these classes are prepended with a `%`. 9 | 10 | Some of the operations, such as getting the size of a collection or removing a key from it, can be implemented the same way whether the HAMT is storing a set or a dictionary. 11 | Other operations differ between the two; for example, accessing elements in conflict nodes is different for sets and dictionaries because a set stores a list while a dictionary stores an association list when there is a hash collision. 12 | Shared code is found in `hamt.lisp`, while operations specific to dictionaries are sets are found in `hash-dict.lisp` and `hash-set.lisp` respectively. 13 | 14 | Operations on table nodes require lots of repetitive bit manipulation. 15 | This has been factored out into the macro `with-table` in `hamt.lisp`. 16 | 17 | Many of the higher-level operations on collections are implemented in terms of `reduce`. -------------------------------------------------------------------------------- /src/hamt.lisp: -------------------------------------------------------------------------------- 1 | 2 | (in-package #:cl-hamt) 3 | 4 | (defclass leaf () 5 | ((key 6 | :reader node-key 7 | :initarg :key 8 | :initform nil))) 9 | 10 | (defclass conflict () 11 | ((hash 12 | :reader conflict-hash 13 | :initarg :hash) 14 | (entries 15 | :reader conflict-entries 16 | :initarg :entries 17 | :initform '()))) 18 | 19 | (defclass table () 20 | ((bitmap 21 | :reader table-bitmap 22 | :initarg :bitmap 23 | :initform 0 24 | :type (unsigned-byte 32)) 25 | (table 26 | :reader table-array 27 | :initarg :table 28 | :initform (make-array 0 :initial-element nil)))) 29 | 30 | ;; Base HAMT class 31 | (defclass hamt () 32 | ((test 33 | :reader hamt-test 34 | :initarg :test 35 | :initform #'equal) 36 | (hash 37 | :reader hamt-hash 38 | :initarg :hash 39 | :initform #'cl-murmurhash:murmurhash) 40 | (table 41 | :reader hamt-table 42 | :initarg :table))) 43 | 44 | 45 | (defmacro with-hamt (hamt (&key test hash table) &body body) 46 | "Accessing HAMT slots" 47 | `(with-accessors ((,test hamt-test) 48 | (,hash hamt-hash) 49 | (,table hamt-table)) 50 | ,hamt 51 | ,@body)) 52 | 53 | (defmacro with-table (node hash depth 54 | (bitmap array bits index hit) 55 | &body body) 56 | "Bitshifting in HAMT tables" 57 | `(let* ((,bitmap (table-bitmap ,node)) 58 | (,array (table-array ,node)) 59 | (,bits (get-bits ,hash ,depth)) 60 | (,index (get-index ,bits ,bitmap)) 61 | (,hit (logbitp ,bits ,bitmap))) 62 | ,@body)) 63 | 64 | 65 | ;; Getting the size of a HAMT 66 | (defgeneric %hamt-size (node)) 67 | 68 | (defmethod %hamt-size ((node leaf)) 69 | 1) 70 | 71 | (defmethod %hamt-size ((node table)) 72 | (loop for node across (table-array node) 73 | sum (%hamt-size node))) 74 | 75 | (defmethod %hamt-size ((node conflict)) 76 | (length (conflict-entries node))) 77 | 78 | 79 | ;; Depending on whether the HAMT is a set or a dict, looking up an entry 80 | ;; returns either a boolean or multiple values respectively, so we defer 81 | ;; implementation to the respective classes. 82 | (defgeneric %hamt-lookup (node key hash depth test)) 83 | 84 | 85 | ;; Removing entries from HAMTs 86 | (defgeneric %hamt-remove (node key hash depth test)) 87 | 88 | (defmethod %hamt-remove ((node leaf) key hash depth test) 89 | (let ((nkey (node-key node))) 90 | (unless (funcall test key nkey) 91 | node))) 92 | 93 | ;; Removing entries from a conflict node differs for sets and dicts 94 | 95 | ;; Removing a key from a table node can mean updating its bitmap if there 96 | ;; is nothing left in the corresponding branch. 97 | (defmethod %hamt-remove ((node table) key hash depth test) 98 | (with-table node hash depth 99 | (bitmap array bits index hit) 100 | (if (not hit) 101 | node 102 | (let ((new-node 103 | (%hamt-remove (aref array index) key hash (1+ depth) test))) 104 | (cond 105 | (new-node 106 | (make-instance (type-of node) 107 | :bitmap bitmap 108 | :table (vec-update array index new-node))) 109 | ((= bitmap 1) nil) 110 | (t (make-instance (type-of node) 111 | :bitmap (logxor bitmap (ash 1 bits)) 112 | :table (vec-remove array index)))))))) 113 | 114 | 115 | ;; Reducing over a HAMT is the same for table nodes of sets and dicts 116 | (defmethod %hamt-reduce (func node initial-value)) 117 | 118 | (defmethod %hamt-reduce (func (node table) initial-value) 119 | (reduce (lambda (r child) 120 | (%hamt-reduce func child r)) 121 | (table-array node) 122 | :initial-value initial-value)) 123 | 124 | ;; Helpers for defining equality between hash sets/dictionaries 125 | (define-condition incompatible-tests-error (error) 126 | ()) 127 | 128 | (defun array-eq (arr1 arr2 test) 129 | (let ((n (length arr1))) 130 | (if (not (= n (length arr2))) 131 | nil 132 | (do ((i 0 (+ i 1))) 133 | ((or (= i n) 134 | (not (funcall test (elt arr1 i) (elt arr2 i)))) 135 | (= i n)))))) 136 | -------------------------------------------------------------------------------- /src/hash-dict.lisp: -------------------------------------------------------------------------------- 1 | (in-package #:cl-hamt) 2 | 3 | ;; Leaves in a dictionary also store the value contained at the node, as well 4 | ;; as a key 5 | (defclass dict-leaf (leaf) 6 | ((value 7 | :reader node-value 8 | :initarg :value 9 | :initform nil))) 10 | 11 | ;; These classes give extra information to dispatch on, e.g. for looking up 12 | ;; entries in conflict nodes 13 | (defclass dict-conflict (conflict) ()) 14 | (defclass dict-table (table) ()) 15 | 16 | 17 | 18 | ;; Methods for looking up key/value pairs in a dict 19 | (defmethod %hamt-lookup ((node dict-leaf) key hash depth test) 20 | (if (funcall test (node-key node) key) 21 | (values (node-value node) t) 22 | (values nil nil))) 23 | 24 | (defmethod %hamt-lookup ((node dict-table) key hash depth test) 25 | (with-table node hash depth 26 | (bitmap array bits index hit) 27 | (if hit 28 | (%hamt-lookup (aref array index) key hash (1+ depth) test) 29 | (values nil nil)))) 30 | 31 | (defmethod %hamt-lookup ((node dict-conflict) key hash depth test) 32 | (declare (ignore hash depth)) 33 | (let ((key-val (assoc key (conflict-entries node) :test test))) 34 | (if key-val 35 | (values (cdr key-val) t) 36 | (values nil nil)))) 37 | 38 | 39 | 40 | ;; Methods for inserting key/value pairs into a dict 41 | (defgeneric %dict-insert (node key value hash depth test)) 42 | 43 | ;; Inserting into a leaf either functionally updates the value stored in the 44 | ;; current node if the keys match, or creates a conflict node if the keys do 45 | ;; not match but their hashes do. 46 | (defmethod %dict-insert ((node dict-leaf) key value hash depth test) 47 | (declare (ignore depth)) 48 | (let ((nkey (node-key node))) 49 | (if (funcall test key nkey) 50 | (make-instance 'dict-leaf 51 | :key key 52 | :value value) 53 | (make-instance 'dict-conflict 54 | :hash hash 55 | :entries (acons key 56 | value 57 | (acons nkey 58 | (node-value node) 59 | '())))))) 60 | 61 | ;; Inserting into a conflict node either updates the value associated to an 62 | ;; existing key, or expands the scope of the conflict 63 | (defmethod %dict-insert ((node dict-conflict) key value hash depth test) 64 | (let ((entries (conflict-entries node))) 65 | (make-instance 'dict-conflict 66 | :hash hash 67 | :entries (if (assoc key entries) 68 | (mapcar (lambda (kv) 69 | (if (funcall test (car kv) key) 70 | (cons key value) 71 | (cons (car kv) (cdr kv)))) 72 | entries) 73 | (acons key value entries))))) 74 | 75 | (defmethod %dict-insert ((node dict-table) key value hash depth test) 76 | (with-table node hash depth 77 | (bitmap array bits index hit) 78 | (flet ((%insert (table) 79 | (%dict-insert table key value hash (1+ depth) test))) 80 | (let ((new-node 81 | (cond 82 | (hit (%insert (aref array index))) 83 | ((= depth 6) (make-instance 'dict-leaf 84 | :key key 85 | :value value)) 86 | (t (%insert (make-instance 'dict-table)))))) 87 | (make-instance 'dict-table 88 | :bitmap (logior bitmap (ash 1 bits)) 89 | :table (funcall (if hit #'vec-update #'vec-insert) 90 | array 91 | index 92 | new-node)))))) 93 | 94 | 95 | 96 | ;; Removing entries from dictionaries. 97 | ;; Most of the functionality is contained in the file hamt.lisp. 98 | 99 | (defun alist-remove (key alist test) 100 | (remove key alist :test (lambda (k p) (funcall test (car p) k)))) 101 | 102 | ;; Removing an entry from a conflict node reduces the scope of the hash 103 | ;; collision. If there is now only 1 key with the given hash, we can 104 | ;; return a dict-leaf, since there is no longer a collision. 105 | (defmethod %hamt-remove ((node dict-conflict) key hash depth test) 106 | (let ((entries (alist-remove key (conflict-entries node) test))) 107 | (if (= (length entries) 1) 108 | (make-instance 'dict-leaf 109 | :key (caar entries) 110 | :value (cdar entries)) 111 | (make-instance 'dict-conflict 112 | :hash hash 113 | :entries entries)))) 114 | 115 | 116 | 117 | ;; Methods for reducing over elements of HAMTs 118 | (defmethod %hamt-reduce (func (node dict-leaf) initial-value) 119 | (funcall func initial-value (node-key node) (node-value node))) 120 | 121 | (defmethod %hamt-reduce (func (node dict-conflict) initial-value) 122 | (labels ((f (alist r) 123 | (if alist 124 | (f (cdr alist) 125 | (funcall func r (caar alist) (cdar alist))) 126 | r))) 127 | (f (conflict-entries node) initial-value))) 128 | 129 | 130 | 131 | ;; Wrapper dictionary class 132 | (defclass hash-dict (hamt) 133 | ((table 134 | :reader hamt-table 135 | :initarg :table 136 | :initform (make-instance 'dict-table 137 | :bitmap 0 138 | :table (make-array 0))))) 139 | 140 | (defun empty-dict (&key (test #'equal) (hash #'cl-murmurhash:murmurhash)) 141 | "Return an empty hash-dict, in which keys will be compared and hashed 142 | with the supplied test and hash functions. The hash must be a 32-bit hash." 143 | (make-instance 'hash-dict 144 | :test (ctypecase test 145 | (function test) 146 | (symbol (symbol-function test))) 147 | :hash hash)) 148 | 149 | (defun dict-lookup (dict key) 150 | "Multiply-return the value mapped to by the key in the dictionary and 151 | whether or not the value is present in the dictionary. 152 | The multiple return is necessary in case a key is present but maps to nil." 153 | (with-hamt dict (:test test :hash hash :table table) 154 | (%hamt-lookup table key (funcall hash key) 0 test))) 155 | 156 | (defun dict-size (dict) 157 | "Return the number of key/value pairs in the dict" 158 | (%hamt-size (hamt-table dict))) 159 | 160 | (defun dict-insert (dict &rest args) 161 | "Return a new dictionary with the key/value pairs added. The key/value 162 | pairs are assumed to be alternating in the &rest argument, so to add the 163 | key/value pairs (k1, v1), ..., (kn, vn), one would invoke 164 | (dict-insert dict k1 v1 ... kn vn). 165 | If any of the keys are already present in the dict passed, they are mapped 166 | to the new values in the returned dict." 167 | (with-hamt dict (:test test :hash hash :table table) 168 | (flet ((%insert (table key value) 169 | (%dict-insert table key value (funcall hash key) 0 test))) 170 | (make-instance 171 | 'hash-dict 172 | :test test 173 | :hash hash 174 | :table (labels ((f (table args) 175 | (if args 176 | (let ((key (car args)) 177 | (value (cadr args))) 178 | (f (%insert table key value) 179 | (cddr args))) 180 | table))) 181 | (f table args)))))) 182 | 183 | (defun dict-remove (dict &rest keys) 184 | "Return a new dict with the keys removed. Any keys passed that are not 185 | already present in the dict are ignored." 186 | (with-hamt dict (:test test :hash hash :table table) 187 | (flet ((%remove (table key) 188 | (%hamt-remove table key (funcall hash key) 0 test))) 189 | (make-instance 'hash-dict 190 | :test test 191 | :hash hash 192 | :table (reduce #'%remove keys :initial-value table))))) 193 | 194 | (defun dict-reduce (func dict initial-value) 195 | "Successively apply a function to key/value pairs of the dict. 196 | The function is assumed to have the signature 197 | `func :: A K V -> A`, 198 | where `A` is the type of the initial-value, `K` is the type of the dict 199 | keys and `V` is the type of dictionary values. 200 | Note that HAMTs do not store items in any order, so the reduction operation 201 | cannot be sensitive to the order in which the items are reduced." 202 | (%hamt-reduce func (hamt-table dict) initial-value)) 203 | 204 | (defun dict-clone (dict test hash) 205 | (empty-dict :test (if test test (hamt-test dict)) 206 | :hash (if hash hash (hamt-hash dict)))) 207 | 208 | (defun dict-map-values (func dict &key test hash) 209 | "Return a new dict with the values mapped by the given function. 210 | Optionally use new comparison and hash functions for the mapped dict." 211 | (dict-reduce (lambda (d k v) 212 | (dict-insert d k (funcall func v))) 213 | dict 214 | (dict-clone dict test hash))) 215 | 216 | (defun dict-map-keys (func dict &key test hash) 217 | "Return a new dict with the keys mapped by the given function." 218 | (dict-reduce (lambda (d k v) 219 | (dict-insert d (funcall func k) v)) 220 | dict 221 | (dict-clone dict test hash))) 222 | 223 | (defun dict-filter (predicate dict) 224 | "Return a new dict consisting of the key/value pairs satisfying the 225 | given predicate." 226 | (dict-reduce (lambda (filtered-dict k v) 227 | (if (funcall predicate k v) 228 | (dict-insert filtered-dict k v) 229 | filtered-dict)) 230 | dict 231 | (empty-dict :test (hamt-test dict) 232 | :hash (hamt-hash dict)))) 233 | 234 | (defun dict-reduce-keys (func dict initial-value) 235 | "Reducing over dictionary keys, ignoring the values." 236 | (flet ((f (r k v) 237 | (declare (ignore v)) 238 | (funcall func r k))) 239 | (dict-reduce #'f dict initial-value))) 240 | 241 | (defun dict-reduce-values (func dict initial-value) 242 | "Reducing over dictionary values, ignoring the keys." 243 | (flet ((f (r k v) 244 | (declare (ignore k)) 245 | (funcall func r v))) 246 | (dict-reduce #'f dict initial-value))) 247 | 248 | (defun dict->alist (dict) 249 | (dict-reduce (lambda (alist k v) 250 | (acons k v alist)) 251 | dict 252 | '())) 253 | 254 | ;; Methods for deciding if two dictionaries are equal 255 | (defgeneric %hash-dict-eq (dict1 dict2 key-test value-test)) 256 | 257 | (defmethod %hash-dict-eq (dict1 dict2 key-test value-test) 258 | (declare (ignore dict1 dict2 key-test value-test)) 259 | nil) 260 | 261 | (defmethod %hash-dict-eq ((node1 dict-leaf) 262 | (node2 dict-leaf) 263 | key-test 264 | value-test) 265 | (and (funcall key-test (node-key node1) (node-key node2)) 266 | (funcall value-test (node-value node1) (node-value node2)))) 267 | 268 | 269 | (defmethod %hash-dict-eq ((node1 dict-conflict) 270 | (node2 dict-conflict) 271 | key-test 272 | value-test) 273 | (and (equal (conflict-hash node1) (conflict-hash node2)) 274 | (labels ((alist-eq (alist1 alist2) 275 | (if (or (not alist1) (not alist2)) 276 | (and (not alist1) (not alist2)) 277 | (let ((key1 (caar alist1)) 278 | (key2 (caar alist2)) 279 | (value1 (cdar alist1)) 280 | (value2 (cdar alist2))) 281 | (when (and (funcall key-test key1 key2) 282 | (funcall value-test value1 value2)) 283 | (alist-eq (cdr alist1) (cdr alist2))))))) 284 | (alist-eq (conflict-entries node1) (conflict-entries node2))))) 285 | 286 | (defmethod %hash-dict-eq ((node1 dict-table) 287 | (node2 dict-table) 288 | key-test 289 | value-test) 290 | (and (equal (table-bitmap node1) (table-bitmap node2)) 291 | (array-eq (table-array node1) 292 | (table-array node2) 293 | (lambda (dict1 dict2) 294 | (%hash-dict-eq dict1 dict2 key-test value-test))))) 295 | 296 | (defun dict-eq (dict1 dict2 &key (value-test #'equal)) 297 | (let ((test1 (hamt-test dict1))) 298 | (if (not (eq test1 (hamt-test dict2))) 299 | (error 'incompatible-tests-error) 300 | (%hash-dict-eq (hamt-table dict1) (hamt-table dict2) test1 value-test)))) 301 | -------------------------------------------------------------------------------- /src/hash-set.lisp: -------------------------------------------------------------------------------- 1 | (in-package #:cl-hamt) 2 | 3 | (defclass set-leaf (leaf) ()) 4 | 5 | (defclass set-conflict (conflict) ()) 6 | (defclass set-table (table) ()) 7 | 8 | 9 | 10 | ;; Methods for looking up whether items are contained in the set 11 | (defmethod %hamt-lookup ((node set-leaf) key hash depth test) 12 | (declare (ignore hash depth)) 13 | (funcall test (node-key node) key)) 14 | 15 | (defmethod %hamt-lookup ((node set-table) key hash depth test) 16 | (with-table node hash depth 17 | (bitmap array bits index hit) 18 | (when hit 19 | (%hamt-lookup (aref array index) key hash (1+ depth) test)))) 20 | 21 | (defmethod %hamt-lookup ((node set-conflict) key hash depth test) 22 | (declare (ignore hash depth)) 23 | (member key (conflict-entries node) :test test)) 24 | 25 | 26 | 27 | ;; Methods for inserting items into the set 28 | (defgeneric %set-insert (node key hash depth test)) 29 | 30 | ;; Adding a new element to a leaf node either returns the leaf node if that 31 | ;; item was already present in the set, or creates a conflict node if there 32 | ;; was a hash collision. 33 | (defmethod %set-insert ((node set-leaf) key hash depth test) 34 | (declare (ignore depth)) 35 | (let ((nkey (node-key node))) 36 | (if (funcall test key nkey) 37 | node 38 | (make-instance 'set-conflict 39 | :hash hash 40 | :entries (list key nkey))))) 41 | 42 | (defmethod %set-insert ((node set-conflict) key hash depth test) 43 | (let ((entries (conflict-entries node))) 44 | (make-instance 'set-conflict 45 | :hash hash 46 | :entries (if (member key entries :test test) 47 | entries 48 | (cons key entries))))) 49 | 50 | (defmethod %set-insert ((node set-table) key hash depth test) 51 | (with-table node hash depth 52 | (bitmap array bits index hit) 53 | (flet ((%insert (table) 54 | (%set-insert table key hash (1+ depth) test))) 55 | (let ((new-node 56 | (cond 57 | (hit (%insert (aref array index))) 58 | ((= depth 6) (make-instance 'set-leaf :key key)) 59 | (t (%insert (make-instance 'set-table)))))) 60 | (make-instance 'set-table 61 | :bitmap (logior bitmap (ash 1 bits)) 62 | :table (funcall (if hit #'vec-update #'vec-insert) 63 | array 64 | index 65 | new-node)))))) 66 | 67 | 68 | 69 | ;; Methods for removing items from a hash-set 70 | (defmethod %hamt-remove ((node set-conflict) key hash depth test) 71 | (let ((entries (remove key (conflict-entries node) :test test))) 72 | (if (= (length entries) 1) 73 | (make-instance 'set-leaf 74 | :key (car entries)) 75 | (make-instance 'set-conflict 76 | :hash hash 77 | :entries entries)))) 78 | 79 | 80 | 81 | ;; Methods for reducing over elements of hash-sets 82 | (defmethod %hamt-reduce (func (node set-leaf) initial-value) 83 | (funcall func initial-value (node-key node))) 84 | 85 | (defmethod %hamt-reduce (func (node set-conflict) initial-value) 86 | (reduce func (conflict-entries node) :initial-value initial-value)) 87 | 88 | 89 | 90 | ;; Wrapper set class 91 | (defclass hash-set (hamt) 92 | ((table 93 | :reader hamt-table 94 | :initarg :table 95 | :initform (make-instance 'set-table 96 | :bitmap 0 97 | :table (make-array 0))))) 98 | 99 | (defun empty-set (&key (test #'equal) (hash #'cl-murmurhash:murmurhash)) 100 | "Return an empty hash-set, in which elements will be compared and hashed 101 | with the supplied test and hash functions. The hash must be a 32-bit hash." 102 | (make-instance 'hash-set 103 | :test (ctypecase test 104 | (function test) 105 | (symbol (symbol-function test))) 106 | :hash hash)) 107 | 108 | (defun set-lookup (set x) 109 | "Return true if the object x is in the set, false otherwise" 110 | (with-hamt set (:test test :hash hash :table table) 111 | (%hamt-lookup table x (funcall hash x) 0 test))) 112 | 113 | (defun set-size (set) 114 | "Return the size of the set" 115 | (%hamt-size (hamt-table set))) 116 | 117 | (defun set-insert (set &rest xs) 118 | "Return a new set with the elements xs added to it. Elements already in 119 | the set are ignored." 120 | (with-hamt set (:test test :hash hash :table table) 121 | (flet ((%insert (table x) 122 | (%set-insert table x (funcall hash x) 0 test))) 123 | (make-instance 'hash-set 124 | :test test 125 | :hash hash 126 | :table (reduce #'%insert xs :initial-value table))))) 127 | 128 | (defun set-remove (set &rest xs) 129 | "Return a new set with the elements xs removed from it. If an item x in 130 | xs is not in the set, it is ignored." 131 | (with-hamt set (:test test :hash hash :table table) 132 | (flet ((%remove (table x) 133 | (%hamt-remove table x (funcall hash x) 0 test))) 134 | (make-instance 'hash-set 135 | :test test 136 | :hash hash 137 | :table (reduce #'%remove xs :initial-value table))))) 138 | 139 | (defun set-reduce (func set initial-value) 140 | "Successively apply a function to elements of the set. The function is 141 | assumed to have the signature 142 | `func :: A B -> A`, 143 | where A is the type of `initial-value` and `B` is the type of set elements. 144 | Note that HAMTs do not store items in any order, so the reduction operation 145 | cannot be sensitive to the order in which the items are reduced." 146 | (%hamt-reduce func (hamt-table set) initial-value)) 147 | 148 | (defun set-map (func set 149 | &key 150 | (test nil test-supplied-p) 151 | (hash nil hash-supplied-p)) 152 | "Return the image of a set under a given function. Optionally use new 153 | comparison and hash functions for the mapped set." 154 | (set-reduce (lambda (mapped-set x) 155 | (set-insert mapped-set 156 | (funcall func x))) 157 | set 158 | (empty-set :test (if test-supplied-p test (hamt-test set)) 159 | :hash (if hash-supplied-p hash (hamt-hash set))))) 160 | 161 | (defun set-filter (predicate set) 162 | "Return the elements of the set satisfying a given predicate." 163 | (set-reduce (lambda (filtered-set x) 164 | (if (funcall predicate x) 165 | (set-insert filtered-set x) 166 | filtered-set)) 167 | set 168 | (empty-set :test (hamt-test set) 169 | :hash (hamt-hash set)))) 170 | 171 | (defun set->list (set) 172 | (set-reduce (lambda (lst x) (cons x lst)) 173 | set 174 | '())) 175 | 176 | (defun set-union (set &rest args) 177 | (reduce (lambda (set1 set2) 178 | (set-reduce #'set-insert set1 set2)) 179 | args :initial-value set)) 180 | 181 | (defun set-intersection (set &rest args) 182 | (reduce (lambda (set1 set2) 183 | (set-filter (lambda (x) (set-lookup set1 x)) set2)) 184 | args :initial-value set)) 185 | 186 | (defun set-diff (set &rest args) 187 | (reduce (lambda (set1 set2) 188 | (set-reduce #'set-remove set2 set1)) 189 | args :initial-value set)) 190 | 191 | (defun set-symmetric-diff (set1 set2) 192 | (set-diff (set-union set1 set2) 193 | (set-intersection set1 set2))) 194 | 195 | 196 | ;; Methods for deciding if two sets are equal 197 | (defgeneric %hash-set-eq (set1 set2 test)) 198 | 199 | (defmethod %hash-set-eq (node1 node2 test) 200 | (declare (ignore node1 node2 test)) 201 | nil) 202 | 203 | (defmethod %hash-set-eq ((node1 set-leaf) (node2 set-leaf) test) 204 | (funcall test (node-key node1) (node-key node2))) 205 | 206 | (defmethod %hash-set-eq ((node1 set-conflict) (node2 set-conflict) test) 207 | (and (equal (conflict-hash node1) (conflict-hash node2)) 208 | (tree-equal (conflict-entries node1) (conflict-entries node2) :test test))) 209 | 210 | (defmethod %hash-set-eq ((node1 set-table) (node2 set-table) test) 211 | (and (equal (table-bitmap node1) (table-bitmap node2)) 212 | (array-eq (table-array node1) 213 | (table-array node2) 214 | (lambda (set1 set2) 215 | (%hash-set-eq set1 set2 test))))) 216 | 217 | (defun set-eq (set1 set2) 218 | (let ((test1 (hamt-test set1))) 219 | (if (not (eq test1 (hamt-test set2))) 220 | (error 'incompatible-tests-error) 221 | (%hash-set-eq (hamt-table set1) (hamt-table set2) test1)))) 222 | -------------------------------------------------------------------------------- /src/package.lisp: -------------------------------------------------------------------------------- 1 | 2 | (defpackage #:cl-hamt 3 | (:nicknames #:hamt) 4 | (:use #:cl) 5 | (:export 6 | ;; Functions for operating on dictionaries 7 | #:empty-dict 8 | #:dict-lookup 9 | #:dict-insert 10 | #:dict-remove 11 | #:dict-size 12 | #:dict-reduce 13 | #:dict-map-values 14 | #:dict-map-keys 15 | #:dict-filter 16 | #:dict-eq 17 | 18 | ;; Functions for operating on sets 19 | #:empty-set 20 | #:set-lookup 21 | #:set-insert 22 | #:set-remove 23 | #:set-size 24 | #:set-reduce 25 | #:set-map 26 | #:set-filter 27 | #:set-eq 28 | 29 | ;; Utilities, conversion routines 30 | #:dict-reduce-keys 31 | #:dict-reduce-values 32 | #:dict->alist 33 | #:set->list 34 | 35 | ;; Set theoretic operations 36 | #:set-union 37 | #:set-intersection 38 | #:set-diff 39 | #:set-symmetric-diff 40 | 41 | ;; Hash array-mapped tries 42 | #:hash-dict 43 | #:hash-set 44 | 45 | ;; Conditions 46 | #:incompatible-tests 47 | )) 48 | -------------------------------------------------------------------------------- /src/util.lisp: -------------------------------------------------------------------------------- 1 | (in-package #:cl-hamt) 2 | 3 | ;; Utility functions for operating on HAMTs. 4 | 5 | (defun get-bits (hash depth) 6 | "Extract bits 5*depth : 5*(depth+1) from the number hash." 7 | (declare (type integer hash depth)) 8 | (ldb (byte 5 (* 5 depth)) hash)) 9 | 10 | (defun get-index (bits bitmap) 11 | "Given the 5-bit int extracted from a hash at the present depth, find 12 | the index in the current array corresponding to this bit sequence." 13 | (logcount (ldb (byte bits 0) bitmap))) 14 | 15 | (defun vec-insert (vec pos item) 16 | (let* ((len (1+ (length vec))) 17 | (v (make-array len))) 18 | (loop for i below len do 19 | (setf (aref v i) 20 | (cond 21 | ((< i pos) (aref vec i)) 22 | ((> i pos) (aref vec (1- i))) 23 | (t item)))) 24 | v)) 25 | 26 | (defun vec-remove (vec pos) 27 | (let* ((len (1- (length vec))) 28 | (v (make-array len))) 29 | (loop for i below len do 30 | (setf (aref v i) 31 | (if (< i pos) 32 | (aref vec i) 33 | (aref vec (1+ i))))) 34 | v)) 35 | 36 | (defun vec-update (vec pos item) 37 | (let* ((len (length vec)) 38 | (v (make-array len))) 39 | (loop for i below len do 40 | (setf (aref v i) 41 | (if (= i pos) 42 | item 43 | (aref vec i)))) 44 | v)) 45 | 46 | 47 | -------------------------------------------------------------------------------- /test/benchmarks.lisp: -------------------------------------------------------------------------------- 1 | 2 | (in-package #:cl-hamt-test) 3 | 4 | (defvar state (make-random-state t)) 5 | 6 | (defun random-set (n) 7 | (do ((i 0 (1+ i)) 8 | (set (empty-set) (set-insert set 9 | (random (ash 1 31) state)))) 10 | ((= i n) set))) 11 | 12 | (defmacro get-timing (trials form) 13 | (let ((start (gensym))) 14 | `(let ((,start (get-internal-run-time))) 15 | (progn 16 | (dotimes (,(gensym) ,trials) ,form) 17 | (float (/ (- (get-internal-run-time) ,start) ,trials)))))) 18 | 19 | ;; TODO: test more operations 20 | ;; TODO: find some portable way to force a GC 21 | 22 | (defun asymptotic-runtime (log-max-size &optional (trials 60)) 23 | "Test the asymptotic runtime of HAMT operations by filling a hash-set 24 | with size = 32, 64, ..., 2 ^ log-max-size random numbers, and return a 25 | list of the execution times / size*log(size)." 26 | (loop for k from 5 below log-max-size 27 | collect (let ((n (ash 1 k))) 28 | (/ (get-timing trials (random-set n)) 29 | (* n k))))) 30 | -------------------------------------------------------------------------------- /test/hash-dict-test.lisp: -------------------------------------------------------------------------------- 1 | 2 | (in-package #:cl-hamt-test) 3 | 4 | (def-suite hash-dict-tests) 5 | (in-suite hash-dict-tests) 6 | 7 | (test empty-dict 8 | (is (not (dict-lookup (empty-dict) "hello")))) 9 | 10 | 11 | (defvar pacers (dict-insert (empty-dict) 12 | "Reggie Miller" 2.01 13 | "Larry Bird" 2.06 14 | "Detlef Schrempf" 2.08 15 | "Paul George" 2.01 16 | "Metta World Peace" 2.01)) 17 | 18 | 19 | (test inserting 20 | (multiple-value-bind (val foundp) 21 | (dict-lookup pacers "Detlef Schrempf") 22 | (is (not (null foundp))) 23 | (is (eql val 2.08))) 24 | (is (= 5 (dict-size pacers))) 25 | (is (eql 2.01 (dict-lookup pacers "Reggie Miller")))) 26 | 27 | 28 | (defvar squares 29 | (labels ((f (d i) 30 | (if (= i 16) 31 | d 32 | (f (dict-insert d i (* i i)) (1+ i))))) 33 | (f (empty-dict) 0))) 34 | 35 | (test removing 36 | (is (= 16 (dict-size squares))) 37 | (is (= 16 (dict-lookup squares 4))) 38 | (is-false (dict-lookup (dict-remove squares 4) 4)) 39 | (is (= 15 (dict-size (dict-remove squares 4)))) 40 | (is-true (dict-lookup (dict-remove squares 4) 6)) 41 | (is (= 14 (dict-size (dict-remove squares 4 12)))) 42 | (is (= 9 (dict-lookup (dict-remove squares 4 12) 3))) 43 | 44 | ;; Check that the data structure is persistent, i.e. a functional update 45 | ;; leaves the original HAMT intact 46 | (is-true (let ((fewer-squares (dict-remove squares 4))) 47 | (and (= 16 (dict-lookup squares 4)) 48 | (not (dict-lookup fewer-squares 4)))))) 49 | 50 | 51 | ;; These pairs of strings hash to the same number under murmurhash. 52 | (defvar some-word-collisions 53 | '(("PSYCHOANALYZE" . "BEDUCKS") 54 | ("PANSPERMIES" . "NONSELF") 55 | ("UNSIGHING" . "TURBITS"))) 56 | 57 | ;; Make a HAMT with them so we can see if we've handled collisions right. 58 | (defvar dict-with-collisions 59 | (labels ((f (dict word-pairs) 60 | (if word-pairs 61 | (let ((word1 (caar word-pairs)) 62 | (word2 (cdar word-pairs))) 63 | (f (dict-insert dict 64 | word1 word1 65 | word2 word2) 66 | (cdr word-pairs))) 67 | dict))) 68 | (f (empty-dict :test #'equal 69 | :hash #'cl-murmurhash:murmurhash) 70 | some-word-collisions))) 71 | 72 | (test collisions 73 | (is (equal 6 (dict-size dict-with-collisions))) 74 | 75 | ;; Check that all the keys were inserted properly 76 | (is (equal "PSYCHOANALYZE" 77 | (dict-lookup dict-with-collisions "PSYCHOANALYZE"))) 78 | (is (equal "BEDUCKS" 79 | (dict-lookup dict-with-collisions "BEDUCKS"))) 80 | (is (equal "PANSPERMIES" 81 | (dict-lookup dict-with-collisions "PANSPERMIES"))) 82 | (is (equal "NONSELF" 83 | (dict-lookup dict-with-collisions "NONSELF"))) 84 | (is (equal "UNSIGHING" 85 | (dict-lookup dict-with-collisions "UNSIGHING"))) 86 | (is (not (dict-lookup dict-with-collisions "IRIDOCYCLITIS"))) 87 | 88 | ;; Check that updating into a key with a hash conflict works 89 | (is (equal "KATY PERRY" 90 | (dict-lookup (dict-insert dict-with-collisions 91 | "NONSELF" 92 | "KATY PERRY") 93 | "NONSELF"))) 94 | 95 | ;; Check that removing a key with a hash collision still leaves the key 96 | ;; it originally collided with in the dictionary 97 | (is (equal "PSYCHOANALYZE" 98 | (dict-lookup (dict-remove dict-with-collisions "BEDUCKS") 99 | "PSYCHOANALYZE")))) 100 | 101 | 102 | (defun alist-same-contents-p (alist1 alist2) 103 | (labels ((f (alist) 104 | (if alist 105 | (let ((p (car alist))) 106 | (if (equal p (assoc (car p) alist2)) 107 | (f (cdr alist)) 108 | nil)) 109 | t))) 110 | (and (= (length alist1) (length alist2)) 111 | (f alist1)))) 112 | 113 | (test reduce 114 | (is (= 120 (dict-reduce-keys #'+ squares 0))) 115 | (is (alist-same-contents-p 116 | (dict-reduce (lambda (alist k v) 117 | (acons k v alist)) 118 | squares 119 | '()) 120 | '((0 . 0) (1 . 1) (2 . 4) (3 . 9) (4 . 16) (5 . 25) 121 | (6 . 36) (7 . 49) (8 . 64) (9 . 81) (10 . 100) (11 . 121) 122 | (12 . 144) (13 . 169) (14 . 196) (15 . 225))))) 123 | 124 | (defvar tall-pacers (dict-filter (lambda (k v) 125 | (declare (ignore k)) 126 | (> v 2.01)) 127 | pacers)) 128 | 129 | (test filter 130 | (is (= 2 (dict-size tall-pacers))) 131 | (is-true (dict-lookup tall-pacers "Larry Bird")) 132 | (is-true (dict-lookup tall-pacers "Detlef Schrempf")) 133 | (is-false (dict-lookup tall-pacers "Reggie Miller"))) 134 | 135 | 136 | (defvar capital-dict 137 | (dict-insert (empty-dict) 138 | "Alonzo Church" "LOGIC" 139 | "Sergei Sobolev" "PDE")) 140 | 141 | (defvar lowercase-dict 142 | (dict-insert (empty-dict) 143 | "Alonzo Church" "logic" 144 | "Sergei Sobolev" "pde")) 145 | 146 | (defvar some-numbers 147 | (loop for i from 1 to 100 collect (random 1000))) 148 | 149 | (test dict-equality 150 | (is-true (dict-eq pacers pacers)) 151 | (is-false (dict-eq pacers (dict-remove pacers "Larry Bird"))) 152 | (is-false (dict-eq pacers (dict-insert pacers "Jonathan Bender" 2.13))) 153 | (is-false (dict-eq dict-with-collisions 154 | (dict-remove dict-with-collisions "PSYCHOANALYZE"))) 155 | (is-true (dict-eq capital-dict lowercase-dict :value-test #'equalp)) 156 | (is-false (dict-eq capital-dict lowercase-dict :value-test #'equal)) 157 | (is-true (let ((dict1 (apply #'dict-insert (cons (empty-dict) some-numbers))) 158 | (dict2 (apply #'dict-insert (cons (empty-dict) some-numbers)))) 159 | (and (not (eq dict1 dict2)) 160 | (dict-eq dict1 dict2)))) 161 | (is-true (dict-eq (empty-dict :test 'equal) (empty-dict :test #'equal)))) 162 | -------------------------------------------------------------------------------- /test/hash-set-test.lisp: -------------------------------------------------------------------------------- 1 | (in-package #:cl-hamt-test) 2 | 3 | (def-suite hash-set-tests) 4 | (in-suite hash-set-tests) 5 | 6 | (test empty 7 | (is (= 0 (set-size (empty-set))))) 8 | 9 | (defvar swinging-hepcats 10 | (set-insert (empty-set) 11 | "Louis Armstrong" 12 | "Earl Hines" 13 | "Artie Shaw" 14 | "Count Basie" 15 | "Duke Ellington" 16 | "Coleman Hawkins")) 17 | 18 | (defvar beboppers 19 | (set-insert (empty-set) 20 | "Coleman Hawkins" 21 | "Charlie Parker" 22 | "Dizzy Gillespie" 23 | "Bud Powell" 24 | "Miles Davis")) 25 | 26 | (test inserting 27 | (is (= 6 (set-size swinging-hepcats))) 28 | (is-true (set-lookup swinging-hepcats "Earl Hines")) 29 | (is-false (set-lookup swinging-hepcats "Kenny G"))) 30 | 31 | (test removing 32 | (is-false (set-lookup (set-remove swinging-hepcats 33 | "Coleman Hawkins") 34 | "Coleman Hawkins")) 35 | (is (= 5 (set-size (set-remove swinging-hepcats "Coleman Hawkins"))))) 36 | 37 | 38 | (defun integer-set (n) 39 | (labels ((f (s k) 40 | (if (= k n) 41 | s 42 | (f (set-insert s k) (1+ k))))) 43 | (f (empty-set) 0))) 44 | 45 | (test reducing 46 | (is (= 45 (set-reduce #'+ (integer-set 10) 0)))) 47 | 48 | (defvar hepcats-and-beboppers 49 | (set-filter (lambda (person) 50 | (set-lookup beboppers person)) 51 | swinging-hepcats)) 52 | 53 | (test filtering 54 | (is-true (set-lookup hepcats-and-beboppers "Coleman Hawkins")) 55 | (is (= 1 (set-size hepcats-and-beboppers)))) 56 | 57 | (test mapping 58 | (is-true (set-lookup (set-map (lambda (k) (* k k)) 59 | (integer-set 10)) 60 | 81))) 61 | 62 | 63 | ;; These pairs of strings hash to the same number under murmurhash. 64 | (defvar some-word-collisions 65 | '(("PSYCHOANALYZE" . "BEDUCKS") 66 | ("PANSPERMIES" . "NONSELF") 67 | ("UNSIGHING" . "TURBITS"))) 68 | 69 | (defvar set-with-collisions 70 | (reduce (lambda (s p) 71 | (set-insert s (car p) (cdr p))) 72 | some-word-collisions 73 | :initial-value (empty-set :test #'equal 74 | :hash #'cl-murmurhash:murmurhash))) 75 | 76 | (test collisions 77 | (is (equal 6 (set-size set-with-collisions))) 78 | (is-true (reduce (lambda (correct word) 79 | (and correct 80 | (set-lookup set-with-collisions 81 | word))) 82 | '("PSYCHOANALYZE" 83 | "BEDUCKS" 84 | "PANSPERMIES" 85 | "NONSELF" 86 | "UNSIGHING" 87 | "TURBITS") 88 | :initial-value t)) 89 | (is-true (set-lookup (set-remove set-with-collisions "PSYCHOANALYZE") 90 | "BEDUCKS")) 91 | (is (= 5 (set-size (set-remove set-with-collisions "BEDUCKS"))))) 92 | 93 | (defvar max-number-value 1000) 94 | (defvar some-numbers 95 | (loop for i from 0 to 100 collecting (random max-number-value))) 96 | 97 | (defvar set-without-collisions 98 | (reduce (lambda (s p) 99 | (set-insert s (car p))) 100 | some-word-collisions 101 | :initial-value (empty-set :test #'equal 102 | :hash #'cl-murmurhash:murmurhash))) 103 | 104 | (test set-equality 105 | (is-false (set-eq swinging-hepcats beboppers)) 106 | (is-true (set-eq (set-union swinging-hepcats beboppers) 107 | (set-union beboppers swinging-hepcats))) 108 | (is-true (set-eq set-with-collisions set-with-collisions)) 109 | (is-true (let ((set1 (apply #'set-insert (cons (empty-set) some-numbers))) 110 | (set2 (apply #'set-insert (cons (empty-set) some-numbers)))) 111 | (and (not (eq set1 set2)) 112 | (set-eq set1 set2)))) 113 | (is-false (set-eq (apply 'set-insert (cons (empty-set) some-numbers)) 114 | (apply 'set-insert 115 | (cons (empty-set) 116 | (cons (+ 1 max-number-value) some-numbers))))) 117 | (is-false (set-eq set-with-collisions set-without-collisions)) 118 | (is-true (set-eq (empty-set :test 'equal) (empty-set :test #'equal)))) 119 | -------------------------------------------------------------------------------- /test/package.lisp: -------------------------------------------------------------------------------- 1 | 2 | (defpackage #:cl-hamt-test 3 | (:use #:cl #:cl-hamt #:fiveam) 4 | (:export #:run! 5 | #:hash-set-tests 6 | #:hash-dict-tests 7 | #:asymptotic-runtime)) 8 | --------------------------------------------------------------------------------