├── Bloom.md ├── LICENSE ├── README.md ├── TEMPLATE.md └── ValkeyJSON.md /Bloom.md: --------------------------------------------------------------------------------
 1 | ---
 2 | RFC: 4
 3 | Status: Accepted
 4 | ---
 5 | 
 6 | # ValkeyBloom Module RFC
 7 | 
 8 | ## Abstract
 9 | 
10 | The proposed feature is ValkeyBloom, a Rust-based Module that brings a native bloom filter data type into Valkey.
11 | 
12 | ## Motivation
13 | 
14 | Bloom filters are a space-efficient probabilistic data structure that can be used to "check" whether an element exists in
15 | a set (with a defined false positive rate), and to "add" elements to a set. While checking whether an item exists, false positives
16 | are possible, but false negatives are not possible. https://en.wikipedia.org/wiki/Bloom_filter
17 | 
18 | To utilize Bloom filters in their client applications, users today use client libraries that are compatible with ReBloom,
19 | including jedis, redis-py, node-redis, nredisstack, rueidis, rustis, etc. This allows customers to perform bloom filter
20 | based operations, e.g. adding items and checking whether they exist.
21 | 
22 | Redis Ltd.'s ReBloom is published under a proprietary license and hence cannot be distributed freely with Valkey.
23 | 
24 | There is growing [demand](https://github.com/orgs/valkey-io/discussions?discussions_q=+bloom+) for an
25 | (1) Open Source bloom filter feature in Valkey which is (2) compatible with the ReBloom API syntax and with
26 | existing ReBloom based client libraries. ValkeyBloom will help address both these requirements.
27 | 
28 | 
29 | ## Design Considerations
30 | 
31 | The ValkeyBloom module brings a bloom module data type into Valkey and provides commands to create / reserve
32 | bloom filters, operate on them (add items, check if items exist), inspect bloom filters, etc.
33 | It allows customization of bloom filter properties (capacity, false positive rate, expansion rate, specification of
34 | scaling vs non-scaling filters, etc.) through commands and configurations. It also allows users to create scalable bloom
35 | filters and back up & restore bloom filters (through RDB load and save).
36 | 
37 | ValkeyBloom provides commands (BF.*), configs, etc. to operate on Bloom objects, which are top-level structures containing
38 | lower-level BloomFilter structures provided by an external crate. ValkeyBloom utilizes an open source (BSD-2) Bloom Filter Rust library
39 | around which it implements a scalable bloom filter.
40 | 
41 | When a bloom filter is created, a bit array is created with a length proportional to the capacity (number of items the user
42 | wants to add to the filter) and hash functions are also created. The number of hash functions is controlled by the
43 | false positive rate that the user configures.
44 | 
45 | When a user adds an item (e.g. BF.ADD) to the filter, the item is passed through the hash functions and the corresponding
46 | bits are set to 1. When a user checks whether an item exists on a filter (e.g. BF.EXISTS), the item is passed through the
47 | hash functions and if all the resolved bits are 1, we can say that the item exists with a false positive rate of
48 | X (specified by the user when creating the filter). If any of the bits are 0, the item does not exist and the BF.EXISTS
49 | operation will return 0.
50 | 
51 | We have the following terminologies / properties:
52 | * Bloom Object: The top level structure representing the data type.
It contains meta data and a list of lower-level 53 | Bloom Filters (Implemented by an external crate) in case of scaling and a single lower-level bloom filter 54 | in case of non scaling. 55 | * Bloom Filter: A single bloom filter (Implemented by an external crate). 56 | * Capacity: The number of items we expect to add to a bloom filter. This controls the size of the filter. 57 | * False Positive Rate: The accuracy the user expects when operating (set / check) on a filter. This controls the number 58 | of hash functions. 59 | * Expansion Rate: This is used in scalable bloom filters where multiple bloom filters are stacked to allow users to 60 | continue using the same bloom object when it reaches capacity by adding another filter of larger 61 | capacity (expansion rate * prev filter capacity). 62 | * Hash Functions: Hash functions used for bit check and set operations underneath. 63 | * Hash Key: Keys used by the hashing function. 64 | 65 | 66 | ### Module OnLoad 67 | 68 | Upon loading, the module registers a new bloom filter module based data type, creates bloom filter (BF.*) commands, 69 | bloom specific configurations and the bloom ACL category. 70 | 71 | * Module name: bf 72 | * Data type name: bloomfltr 73 | * Module shared object file name: valkeybloom.so 74 | 75 | With the Module name as "bf", ValkeyBloom is compatible with ReBloom in its Module name which is accessible by clients 76 | through HELLO, MODULE LIST, and INFO commands. Also, metrics and configs will be prefixed with this name (by design for Modules). 77 | 78 | Regarding the Module Data type name, because ValkeyBloom's Module Data type (the current version) is not compatible with 79 | ReBloom, it is not named same as ReBloom's. We are naming it "bloomfltr" and it is exactly 9 characters as enforced by 80 | core Valkey logic for Module data types. 81 | This will allow us to create a new Module Data Type in the future which can be compatible with the ReBloom and this will 82 | be named the same as ReBloom's Bloom Filter data type. When we do this, we will need to support both Data Types (current 83 | version) and the future version and their names must be unique. 84 | 85 | ### Module Unload 86 | 87 | Once the Module has been loaded, the `MODULE UNLOAD` will be rejected since Module Data type is created on load. 88 | Valkey does not allow unloading of Modules that exports a module data type. 89 | 90 | ``` 91 | 127.0.0.1:6379> MODULE UNLOAD bloom 92 | (error) ERR Error unloading module: the module exports one or more module-side data types, can't unload 93 | ``` 94 | 95 | ### Persistence 96 | 97 | ValkeyBloom implements persistence related Module data type callbacks for the Bloom data type: 98 | 99 | * rdb_save: Serializes bloom objects to RDB. 100 | * rdb_load: Deserializes bloom objects from RDB. 101 | * aof_rewrite: Emits commands into the AOF during the AOF rewriting process. 102 | 103 | ### RDB Save and Load 104 | 105 | During RDB Save of a bloom object, the Module will save the number of filters, expansion rate, false positive rate. 106 | And for every underlying bloom filter in this object, number of hashing functions, number of bits of the bit array, 107 | bytes of the bit array itself. 108 | 109 | We do not save the seed of the hash function used by Bloom Filters in the RDB because the Module uses a fixed seed. 110 | During RDB Load, we restore and re-create the Bloom object using the RDB data and the fixed seed. 
111 | The main benefits to using fixed seed are that it reduces the RDB size and it simplifies the RDB save and load. 112 | 113 | ### RDB Compatibility with ReBloom 114 | 115 | ValkeyBloom is not RDB Compatible with ReBloom. 116 | 117 | The meta data that gets written to the RDB is specific to the Module data type's structure and struct members. 118 | Additionally, the data within the underlying bloom filter (from the external crate) is specific to the implementation of 119 | the bloom filter as the hash key (seed), hashing algorithm/s, raw bit array data, etc. can all vary. 120 | 121 | Restoring a bloom filter means that items will need to resolve (through hash functions) to the same indexes of the bit 122 | array. The same hash seed, hashing algorithms, and number of hash functions, bit array will need to be used in order for 123 | previously "added" items to the bloom filter to be resolved through "exists" operations after restoration. 124 | 125 | Because of this, it is not possible to be RDB compatible with ReBloom. 126 | 127 | ### AOF Rewrite handling 128 | 129 | Module data types (including bloom) can implement a callback function that will be triggered for Bloom objects to rewrite 130 | its data as command/s. From the AOF callback, we will handle AOF rewrite by saving a BF.LOAD command with the key, TTL, and 131 | serialized value of the corresponding bloom object. 132 | 133 | ### Migrating workloads from ReBloom: 134 | 135 | Customers that currently use ReBloom can move to ValkeyBloom using two approaches: 136 | 137 | 1. Create the bloom filter objects to have the same properties using BF.RESERVE / BF.INSERT on Valkey (with ValkeyBloom loaded). 138 | Re-populate the bloom filter objects by inserting items by moving the existing bloom workload to the Valkey server 139 | (with ValkeyBloom loaded). The workload can be moved without any errors since we are API Compatible. 140 | 141 | Pros 142 | * This is the simplest option in terms of effort - assuming the user is fine with recreating bloom objects and adding 143 | items on the Valkey server (with ValkeyBloom). 144 | 145 | Cons 146 | * The user will need to re-create the bloom objects (using BF.RESERVE/BF.INSERT) and populate these objects **AFTER** 147 | switching to Valkey (with ValkeyBloom). 148 | 149 | 2. Users can generate an AOF file from a server (that has the ReBloom module loaded) and with an on-going bloom workload 150 | that creates bloom filters & inserts items into them. Next, this can be re-played on a Valkey Server (with ValkeyBloom loaded). 151 | Then, the user can move their existing bloom workload to the Valkey server (with ValkeyBloom loaded). 152 | 153 | Pros 154 | * Bloom objects can be created on the user's existing system (with Rebloom) and the user can also populate the objects 155 | by adding items (BF.ADD/BF.MADD) **BEFORE** switching to Valkey (with ValkeyBloom). 156 | 157 | Cons 158 | * The user will need to manage AOF file generation on the server (with ReBloom) and ensure that the contents do not get 159 | re-written (e.g. as a result of BGREWRITEAOF) into a RDB file. This is because the RDB file with bloom object data 160 | generated by ReBloom is not compatible with ValkeyBloom. The main drawback here is that the AOF can exceed a size 161 | limit and get truncated / re-written. 
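For approach 1 above, a minimal sketch of re-creating and re-populating one bloom object is shown below (the key name, properties, and items are purely illustrative; the original items would come from the application's own source of truth):

```
127.0.0.1:6379> BF.RESERVE events 0.01 50000 EXPANSION 2
OK
127.0.0.1:6379> BF.MADD events event:1 event:2 event:3
1) (integer) 1
2) (integer) 1
3) (integer) 1
```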
162 | 
163 | ### Memory Management
164 | 
165 | On Module Load, the Rust Module overrides the memory allocator so that it delegates the allocation and deallocation tasks
166 | to the Valkey server using the existing Module APIs: ValkeyModule_Alloc and ValkeyModule_Free. This API panics if unable
167 | to allocate enough memory.
168 | 
169 | The bloom data type also supports memory management related callbacks:
170 | 
171 | * free: Frees the bloom filter object when it is deleted, expired or evicted.
172 | * defrag: Supports active defrag for Bloom filter objects
173 | * mem_usage: Reports bytes used by a Bloom object and is used by the MEMORY USAGE command
174 | * copy: Supports a deep copy of bloom filter objects and is used by the COPY command
175 | * free_effort: Determines whether the bloom object's memory needs to be lazily reclaimed or synchronously freed. We return
176 | the number of filters in the bloom object as the free effort and this is similar to how the core handles free_effort
177 | for aggregated objects.
178 | 
179 | ### Replication
180 | 
181 | Every Bloom Filter based write operation (bloom object creations, scaling, setting of an item on a filter which returns 1
182 | indicating a new entry, etc) will be replicated to replica nodes. Attempts to add an already existing item to a bloom
183 | object (which returns 0) will not be replicated.
184 | 
185 | Note: When BF.ADD/BF.MADD/BF.INSERT commands (containing one or more items) are executed on a scalable bloom object which
186 | is at full capacity on the primary node, we check whether the item exists or not. This check is based on the configured false
187 | positive rate. If the item is not found, the command results in scaling out by adding a new filter to the bloom object,
188 | adding the item to it, and then replicating the command verbatim to replica nodes. However, the replicated command can
189 | result in a false positive when it checks whether the item exists. In this case, the scale out does not occur on the bloom
190 | object on the replica. This can result in slightly different memory usage between primary and replica nodes, which is more
191 | apparent when bloom objects have large filters.
192 | 
193 | ### Non Scalable filters
194 | 
195 | When non-scaling filters reach their capacity, if a user tries to add items to the bloom object, an error is returned. This
196 | default behavior is based on ReBloom. This helps keep the false positive error rate of the Bloom object at what the user
197 | requested when creating the bloom object.
198 | 
199 | A configuration can be used to provide an alternative behavior of allowing bloom objects to be saturated by allowing add
200 | operations (BF.ADD/BF.MADD/BF.INSERT) to continue without being rejected even when a filter is at full capacity. This will
201 | increase the false positive error rate, but a user can opt into this behavior to allow add operations to "succeed".
202 | 
203 | ### Scalable filters
204 | 
205 | Bloom Filters can either be configured as scalable (default) or non-scalable through specification with the BF.RESERVE or BF.INSERT command.
206 | 
207 | When scaling filters reach their capacity, if a user adds an item to the bloom object, a new bloom filter is created and
208 | added to the list of bloom filters within the same bloom object. This new bloom filter will have a larger capacity
209 | (previous bloom filter's capacity * expansion rate of the bloom object).
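For illustration, the hypothetical session below shows one scale out of a bloom object created with a capacity of 1000 and an expansion rate of 2 (the key name and replies are illustrative; the new total capacity follows from 1000 + 1000 * 2):

```
127.0.0.1:6379> BF.RESERVE filter1 0.001 1000 EXPANSION 2
OK
127.0.0.1:6379> BF.INFO filter1 CAPACITY
1) (integer) 1000
127.0.0.1:6379> BF.INFO filter1 FILTERS
1) (integer) 1
# (more than 1000 distinct items are added via BF.ADD/BF.MADD, triggering a scale out)
127.0.0.1:6379> BF.INFO filter1 FILTERS
1) (integer) 2
127.0.0.1:6379> BF.INFO filter1 CAPACITY
1) (integer) 3000
```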
210 | 
211 | When we want to check whether an item exists on a bloom object (BF.EXISTS/BF.MEXISTS), we look through each filter
212 | (from oldest to newest) in the object's filter list and perform a check operation on each one. Similarly, to add a new
213 | item to the bloom object, we check through all the filters to see if the item already exists and if not, the item is
214 | added to the current filter.
215 | 
216 | When a bloom object has a large number of bloom filters, the performance of these operations is reduced.
217 | 
218 | Additionally, the default expansion rate is 2 and auto scaling of filters is enabled by default. The table below shows
219 | the total capacity (across all filters) when there are x additional (scaled out) filters and the starting filter has a
220 | capacity of 1.
221 | 
222 | | x   | Capacity with x additional filters   |
223 | |-----|--------------------------------------|
224 | | 0   | 1                                    |
225 | | 1   | 3                                    |
226 | | 2   | 7                                    |
227 | | 3   | 15                                   |
228 | | 4   | 31                                   |
229 | | 5   | 63                                   |
230 | | 10  | 2047                                 |
231 | | 15  | 65535                                |
232 | | 20  | 2097151                              |
233 | | 25  | 67108863                             |
234 | | 30  | 2147483647                           |
235 | | 35  | 68719476735                          |
236 | | 40  | 2199023255551                        |
237 | | 45  | 70368744177663                       |
238 | | 50  | 2251799813685247                     |
239 | | 55  | 72057594037927935                    |
240 | | 60  | 2305843009213693951                  |
241 | | 65  | 73786976294838206463                 |
242 | | 70  | 2361183241434822606847               |
243 | | 75  | 75557863725914323419135              |
244 | | 100 | 2535301200456458802993406410751      |
245 | 
246 | Practically, we can expect scaling to stop well before the 50 range. With the default capacity of 100,000 and the default
247 | expansion rate of 2, the total items inserted (across all filters) would exceed the 64-bit unsigned integer limit before
248 | reaching 50 filters.
249 | 
250 | Below are results from an example performance test scenario. We have one BloomObject that is configured with the following
251 | properties:
252 | 
253 | * Expansion rate = 1 (Scaling enabled)
254 | * Capacity per filter = 1000 (This will be the same capacity for every subsequent filter due to expansion of 1)
255 | * False positive rate = 0.001
256 | 
257 | In the test run, we start up the Valkey server on a 4-core machine.
258 | 
259 | Starting the server (pinned to 2 cores):
260 | 
261 | ```
262 | valkey-server --loadmodule <path to valkeybloom.so>
263 | sudo taskset -cp 0,1 <valkey-server pid>
264 | ```
265 | 
266 | Creating the BloomObject & Running the benchmark (pinned to 1 core):
267 | 
268 | ```
269 | 127.0.0.1:6379> bf.reserve key 0.001 1000 expansion 1
270 | # Add enough items to scale out such that N additional filters are added to the bloom object "key".
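# For example (hypothetical helper; any way of adding enough distinct items works), a shell loop can grow the object:
# for i in $(seq 1 25000); do valkey-cli BF.ADD key item$i > /dev/null; done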
271 | sudo taskset -c 2 /home/ec2-user/valkey-benchmark -n 1000000 BF.EXISTS key item 272 | ``` 273 | 274 | **Results averaged over 3 runs:** 275 | 276 | | Total Capacity | Filters per Bloom Object | p50 AVG | p95 AVG | p99 AVG | TPS AVG | 277 | |----------------|--------------------------|-----------|-----------|-----------|---------------| 278 | | 1000 | 1 | 0.263 | 0.30833 | 0.559 | 95218.58667 | 279 | | 10000 | 10 | 0.26033 | 0.311 | 0.57233 | 94765.19333 | 280 | | 25000 | 25 | 0.255 | 0.30033 | 0.55367 | 98458.69667 | 281 | | 50000 | 50 | 0.26833 | 0.50833 | 0.831 | 99669.16667 | 282 | | 100000 | 100 | 0.487 | 0.711 | 0.99367 | 83829.30333 | 283 | | 250000 | 250 | 0.96167 | 1.17233 | 1.30033 | 49318.69 | 284 | | 500000 | 500 | 1.991 | 2.31633 | 2.40167 | 25476.69333 | 285 | | 1000000 | 1000 | 4.45233 | 4.76433 | 4.879 | 12034.26667 | 286 | 287 | 288 | ### Choice of Bloom Filter Library 289 | 290 | We evaluated the following libraries: 291 | * https://crates.io/crates/bloomfilter 292 | * https://crates.io/crates/growable-bloom-filter 293 | * https://crates.io/crates/fastbloom 294 | 295 | Here are some important factors when evaluating libraries: 296 | * Is it actively maintained? 297 | * Has there been any breaking changes in serialization and deserialization of bloom filters over version updates? 298 | * Performance 299 | * Popularity (More Crate Downloads / GitHub Stars) 300 | 301 | https://crates.io/crates/bloomfilter was chosen as it provides all the required APIs for controlling a bloom filter. 302 | This library has also been stable through the releases. It uses SipHash as the hashing algorithm ([SipHasher13](https://docs.rs/siphasher/latest/siphasher/sip128/struct.SipHasher13.html)). 303 | 304 | Certain libraries (e.g. fastbloom) are no longer compatible with older versions (after updates) resulting in 305 | serialization / deserialization incompatibility. 306 | 307 | growable-bloom-filter is a bloom filter library that implements a scalable bloom filter. However, it does not allow 308 | control over when the scaling occurs and does not expose APIs to check the number of filters that it has scaled out to. 309 | This crate also has had a breaking change across versions and bloom filters of the older version are not loadable on the 310 | newer versions. 311 | 312 | 313 | ### Large BloomFilter objects 314 | 315 | Create and Delete operations on large bloom filters take longer durations and will block the main thread for this duration. 316 | Because of this, the following operations will be handled differently. 317 | 318 | **defrag callback:** 319 | 320 | If the memory used by any bloom filter within the bloom object is greater than 4 KB (`bloom_large_item_threshold` 321 | constant), we will skip defrag operations on this bloom object. Otherwise, we will defrag the bloom object in iterations 322 | for each bloom filter in the bloom object. 323 | 324 | **free_effort callback:** 325 | 326 | This callback decides the free effort for the bloom object. If it is greater than 4 KB 327 | (`bloom_large_item_threshold` constant), we will return 0 to use async free on the bloom object. 328 | 329 | **write operations (BF.ADD/MADD/INSERT/RESERVE):** 330 | 331 | If the write operation requires creation of a new bloom filter on a particular bloom object, we will compute the memory 332 | usage of the bloom filter that is about to be created (based on capacity and false positive rate). 
If the memory usage
333 | is greater than 64 MB (`bloom_filter_max_memory_usage` constant), the write operation will be rejected.
334 | 
335 | Scalable Bloom filters will grow in used memory after creation of the bloom object - but only as a result of a BF.ADD, BF.MADD,
336 | or BF.INSERT operation, and we will reject these requests if they require a scale out operation that would result in the
337 | creation of a filter greater than the allowed size as explained above.
338 | 
339 | It is possible for the user to create their bloom object with an expansion rate of 1. In this case, it is possible that
340 | the bloom object consists of a vector of several bloom filters that are just below the `bloom_filter_max_memory_usage`
341 | threshold. In this case, the Bloom object becomes similar to the Native List data type in that we allow multiple elements
342 | in the list, but we enforce a limit (4 GB for Lists) for each individual element in the list.
343 | 
344 | ## Specification
345 | 
346 | ### RDB Format
347 | 
348 | During RDB save, the Module data type callback is invoked and we save the required metadata and bloom filter specific data
349 | across every element in the bloom object's vector of filters.
350 | 
351 | ```
352 | <number of filters>
353 | <expansion rate>
354 | <false positive rate>
355 | <filter 1: number of hash functions>
356 | <filter 1: number of bits in the bit array>
357 | <filter 1: bit array bytes>
358 | 
359 | .
360 | .
361 | .
362 | 
363 | <filter n: number of hash functions>
364 | <filter n: number of bits in the bit array>
365 | <filter n: bit array bytes>
366 | 
367 | ```
368 | 
369 | ### Bloom Filter Command API
370 | 
371 | The following are supported Bloom Filter commands with API syntax compatible with ReBloom:
372 | 
373 | **`BF.ADD <key> <item>`**
374 | 
375 | This API can be used to add an item to an existing bloom object or to create + add the item.
376 | Response is in the Integer reply format.
377 | 
378 | Item does not exist (based on false positive rate) and was successfully added to the bloom filter.
379 | If a bloom object with the given key does not exist, the bloom object is created and the item will be added to it.
380 | ```
381 | (integer) 1
382 | ```
383 | 
384 | Item already exists (based on false positive rate).
385 | ```
386 | (integer) 0
387 | ```
388 | 
389 | **`BF.EXISTS <key> <item>`**
390 | 
391 | This API can be used to check if an item exists in a bloom object.
392 | Response is in the Integer reply format.
393 | 
394 | Item exists (based on false positive rate).
395 | ```
396 | (integer) 1
397 | ```
398 | 
399 | Item does not exist (based on false positive rate).
400 | ```
401 | (integer) 0
402 | ```
403 | 
404 | **`BF.MADD <key> <item> [<item> ...]`**
405 | 
406 | This API can be used to add item/s to an existing bloom object or to create + add the item/s.
407 | Response is in the Array reply format with one or more Integer replies (one for each item argument provided).
408 | 
409 | 1 indicates the item does not exist (based on false positive rate) yet and was added successfully to the bloom filter.
410 | If a bloom object with the given key does not exist, the bloom object is created and the item will be added to it.
411 | 0 indicates the item already exists (based on false positive rate).
412 | ```
413 | (integer) 1
414 | (integer) 1
415 | (integer) 0
416 | ```
417 | 
418 | **`BF.MEXISTS <key> <item> [<item> ...]`**
419 | 
420 | This API can be used to check if item/s exist in a bloom object.
421 | Response is in the Array reply format with one or more Integer replies (one for each item argument provided).
422 | 
423 | 1 indicates the item exists (based on false positive rate). 0 indicates the item does not exist (based on false positive rate).
424 | ```
425 | (integer) 1
426 | (integer) 1
427 | (integer) 0
428 | ```
429 | 
430 | **`BF.CARD <key>`**
431 | 
432 | This API can be used to check the number of items added to the bloom object (across all the filters).
433 | Response is in the Integer reply format.
434 | ```
435 | (integer) 20
436 | ```
437 | 
438 | **`BF.INFO <key> [CAPACITY | SIZE | FILTERS | ITEMS | EXPANSION]`**
439 | 
440 | The API can be used to get info statistics on a particular bloom object across all its filters.
441 | Response is in an Array reply format with one or more Integer replies (based on whether a specific info stat is provided).
442 | 
443 | "Capacity" is the number of items that can be stored on the bloom object across its vector of bloom filters.
444 | 
445 | "Size" is the total memory used by the bloom object (including memory allocated for each bloom filter).
446 | 
447 | "Filters" is the number of bloom filters that the bloom object has. 1 indicates no scale out has occurred. >1 indicates otherwise.
448 | 
449 | "Number of items inserted" is the number of items added to the bloom object across all the filters.
450 | 
451 | "Expansion rate" defines the auto scaling behavior. -1 indicates the bloom object is non-scaling. >=1 indicates scaling.
452 | 
453 | ```
454 | 127.0.0.1:6379> BF.INFO bloomobject1
455 | 1) Capacity
456 | 2) (integer) 1000
457 | 3) Size
458 | 4) (integer) 1431
459 | 5) Number of filters
460 | 6) (integer) 1
461 | 7) Number of items inserted
462 | 8) (integer) 1
463 | 9) Expansion rate
464 | 10) (integer) 2
465 | 127.0.0.1:6379> BF.INFO bloomobject1 SIZE
466 | 1) (integer) 1431
467 | 127.0.0.1:6379> BF.INFO bloomobject1 CAPACITY
468 | 1) (integer) 1000
469 | 127.0.0.1:6379> BF.INFO bloomobject1 FILTERS
470 | 1) (integer) 1
471 | ```
472 | 
473 | **`BF.RESERVE <key> <error rate> <capacity> [EXPANSION <expansion>] | [NONSCALING]`**
474 | 
475 | This API is used to create a bloom object with specific properties.
476 | When the command is used, only either EXPANSION or NONSCALING can be used. If both are used, an error is returned.
477 | 
478 | "CAPACITY" is the number of items that the user wants to store on the bloom object. For non scaling, this is a limit on
479 | the number of items that can be inserted. For scaling, this is the number of items that can be added after which scaling
480 | occurs.
481 | 
482 | "EXPANSION" can be used to create a scaling enabled bloom object with the specified expansion rate.
483 | 
484 | "NONSCALING" can be used to indicate that the bloom object should not auto scale once items are added such that it reaches
485 | full capacity.
486 | 
487 | The response is a Simple String reply with OK indicating successful creation.
488 | ```
489 | OK
490 | ```
491 | 
492 | **`BF.INSERT <key> [CAPACITY <capacity>] [ERROR <error rate>] [EXPANSION <expansion>] [NOCREATE] [NONSCALING] ITEMS <item> [<item> ...]`**
493 | 
494 | This API is used to create a bloom object with specific properties and add item/s to it.
495 | 
496 | "CAPACITY" is the number of items that the user wants to store on the bloom object. For non scaling, this is a limit on
497 | the number of items that can be inserted. For scaling, this is the number of items that can be added after which scaling
498 | occurs.
499 | 
500 | "ERROR" is the false positive error rate.
501 | 
502 | "NOCREATE" can be used to specify that the command should not result in creation of a new bloom object if it does not exist.
503 | If NOCREATE is used along with CAPACITY or ERROR, an error is returned.
504 | 
505 | "EXPANSION" can be used to create a scaling enabled bloom object with the specified expansion rate.
506 | 
507 | "NONSCALING" can be used to indicate that the bloom object should not auto scale once items are added such that it reaches
508 | full capacity. Only either EXPANSION or NONSCALING can be used. If both are used, an error is returned.
509 | 
510 | "ITEMS" can be used to list one or more items to add to the bloom object.
511 | 
512 | The response is an Array reply with one or more Integer replies.
513 | 1 indicates the item does not exist (based on false positive rate) yet and was added successfully to the bloom filter.
514 | If a bloom object with the given key does not exist, the bloom object is created and the item will be added to it.
515 | 0 indicates the item already exists (based on false positive rate).
516 | ```
517 | (integer) 1
518 | (integer) 1
519 | (integer) 0
520 | ```
521 | 
522 | The following are NEW commands which are not included in ReBloom:
523 | 
524 | **`BF.LOAD <key> <ttl> <serialized value>`**
525 | 
526 | Response is in the Simple String reply format. Returns OK on a successful restoration of a bloom object.
527 | This command is only used during AOF Rewrite and is written into the AOF file to help with restoration.
528 | 
529 | ```
530 | OK
531 | ```
532 | 
533 | Currently, the following commands (from ReBloom) are not supported:
534 | 
535 | **`BF.LOADCHUNK <key> <iterator> <data>`**
536 | 
537 | **`BF.SCANDUMP <key> <iterator>`**
538 | 
539 | The BF.SCANDUMP command is used to perform an incremental save on a specific Bloom filter object.
540 | The BF.LOADCHUNK command is used to incrementally load / restore a bloom filter object from the data produced by BF.SCANDUMP.
541 | 
542 | The reason for not implementing these two commands is that the ValkeyBloom Module provides the ability to load and save BloomModule
543 | data type items as part of RDB load and save operations. Additionally, the BF.LOAD command is supported for AOF operations
544 | to re-create the same Bloom object. Because of these reasons, the BF.LOADCHUNK and BF.SCANDUMP APIs will not be supported.
545 | 
546 | ### Configurations
547 | 
548 | The default properties with which Bloom Filter objects are created can be controlled using configs. The values of the
549 | configs below are only used on a bloom object if the user does not specify the properties explicitly. Example: Using
550 | BF.INSERT or BF.RESERVE can override the default properties.
551 | 
552 | Supported Module configurations:
553 | 1. bf.bloom_capacity: Controls the default capacity. When create operations (BF.ADD/MADD) are used, bloom objects created
554 | will use the capacity specified by this config. This controls the number of items that can be added to a bloom filter
555 | object before it is scaled out (scaling) OR before it rejects add operations due to insufficient capacity on the object (nonscaling).
556 | 2. bf.bloom_expansion_rate: Controls the default expansion rate. When create operations (BF.ADD/MADD) are used, bloom
557 | objects created will use the expansion rate specified by this config. This controls the capacity of the new filter
558 | that gets added to the list of filters as a result of scaling.
559 | 3. bf.bloom_fp_rate: Controls the default false positive rate that new bloom objects created (from BF.ADD/MADD) will use.
560 | 
561 | ### Constants
562 | 1. bloom_large_item_threshold: Memory usage of a bloom object beyond which bloom objects are exempted from defrag operations
563 | and when deleted, the Module will indicate the object's free_effort as 0 to be async freed.
564 | 2. bloom_filter_max_memory_usage: The maximum memory usage of a particular bloom filter that is allowed.
Creation of 565 | bloom filters larger than this will not be allowed. 566 | 567 | ### ACL 568 | 569 | The ValkeyBloom module will introduce a new ACL category - @bloom. 570 | 571 | There are 4 existing ACL categories which are updated to include new BloomFilter commands: @read, @write, @fast, @slow. 572 | 573 | ### Keyspace Event Notification 574 | 575 | Every bloom filter based write command (that involves mutation as explained in the section above) will be made to publish 576 | a keyspace event after the data is mutated. Commands include: BF.RESERVE, BF.ADD, BD.MADD, and BF.INSERT. 577 | 578 | * Event type: VALKEYMODULE_NOTIFY_GENERIC 579 | * Event name: One of the two event names will be published based on the command & scenario: 580 | * bloom.add: Any BF.ADD, BF.MADD, or BF.INSERT command that results in adding item/s to a bloom object. 581 | * bloom.reserve: Any BF.ADD, BF.MADD, BF.INSERT, or BF.RESERVE command that results in creation of a bloom object. 582 | 583 | 584 | Users can subscribe to the bloom events via the standard keyspace event pub/sub. For example, 585 | 586 | ```text 587 | 1. enable keyspace event notifications: 588 | valkey-cli config set notify-keyspace-events KEA 589 | 2. suscribe to keyspace & keyevent event channels: 590 | valkey-cli psubscribe '__key*__:*' 591 | ``` 592 | 593 | ### Info metrics 594 | 595 | Info metrics are visible through the `info bloom` or `info modules` command: 596 | * Number of bloom filter objects 597 | * Total bytes used by bloom filter objects 598 | 599 | In addition to this, we have a specific API (`BF.INFO`) that can be used to list information and stats on the bloom object. 600 | 601 | ## References 602 | * [ValkeyBloom GitHub Issue on the valkey project](https://github.com/valkey-io/valkey/issues/407) 603 | * [ValkeyBloom GitHub Repo](https://github.com/KarthikSubbarao/valkey-bloom) 604 | * [BloomFilter discussions on valkey-io](https://github.com/orgs/valkey-io/discussions?discussions_q=+bloom+) 605 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | BSD 3-Clause License 2 | 3 | Copyright (c) 2024, valkey-io 4 | 5 | Redistribution and use in source and binary forms, with or without 6 | modification, are permitted provided that the following conditions are met: 7 | 8 | 1. Redistributions of source code must retain the above copyright notice, this 9 | list of conditions and the following disclaimer. 10 | 11 | 2. Redistributions in binary form must reproduce the above copyright notice, 12 | this list of conditions and the following disclaimer in the documentation 13 | and/or other materials provided with the distribution. 14 | 15 | 3. Neither the name of the copyright holder nor the names of its 16 | contributors may be used to endorse or promote products derived from 17 | this software without specific prior written permission. 18 | 19 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 20 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 21 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 22 | DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
23 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
24 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
25 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
26 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
27 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
28 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
29 | 
-------------------------------------------------------------------------------- /README.md: --------------------------------------------------------------------------------
 1 | ---
 2 | RFC: 1
 3 | Status: Informational
 4 | ---
 5 | 
 6 | Valkey RFC
 7 | ==========
 8 | 
 9 | This repository is a collection of feature proposals and descriptions of changes to Valkey
10 | that require some more detail than just the text in a pull request or an issue.
11 | It is loosely inspired by RFCs and by Python's enhancement proposals (PEP).
12 | 
13 | Each feature or larger topic is described in a markdown file that's named in
14 | uppercase and ends in `.md`. These files are not formally numbered, but we use
15 | the pull request number that initially added an RFC to refer to the change. For example,
16 | this description in the README.md file was written in RFC #1.
17 | 
18 | Workflow
19 | --------
20 | 
21 | An RFC starts off as a pull request. It's reviewed for formatting, style,
22 | consistency and content quality. The content shouldn't be very vague or unclear.
23 | Then the proposal is merged. This doesn't mean that the feature is approved for
24 | inclusion in Valkey. It's still just a proposal.
25 | 
26 | Each file has one of the following statuses:
27 | 
28 | * **Proposed**, meaning the file was added but there's no decision about it yet.
29 | * **Approved**, meaning the core team has made a decision to accept the feature.
30 | * **Rejected**, meaning the core team has made a decision to not accept the feature.
31 | * **Informational**, for information that is not a feature, like this README file.
32 | 
33 | The core team (the Technical Steering Committee) can change the status and make
34 | changes. For larger changes, the PRs making the changes are mentioned too and can
35 | be referred to by their respective pull-request numbers.
36 | 
37 | What's useful to include?
38 | -------------------------
39 | 
40 | The Valkey RFC format is not a strict format, but should include the following
41 | sections unless they are unnecessary for the proposal you are submitting.
42 | 
43 | * Status and RFC number (the pull-request number).
44 | * Abstract. A few sentences describing the feature.
45 | * Motivation. What the feature solves and why the existing functionality is not
46 | enough.
47 | * Design considerations. A description of the design constraints and
48 | requirements for the proposal. Comparisons with similar features in other
49 | projects.
50 | * Specification. A more detailed description of the feature, including why
51 | certain details in the design have been chosen.
52 | * Links to related material such as issues, pull requests, papers, or other references.
53 | 
54 | Here's an [RFC template](TEMPLATE.md) to get started.
55 | -------------------------------------------------------------------------------- /TEMPLATE.md: -------------------------------------------------------------------------------- 1 | --- 2 | RFC: (PR number) 3 | Status: (Change to Proposed when it's ready for review) 4 | --- 5 | 6 | # Title (Required) 7 | 8 | ## Abstract (Required) 9 | 10 | A few sentences describing the feature. 11 | 12 | ## Motivation (Required) 13 | 14 | What the feature solves and why the existing functionality is not enough. 15 | 16 | ## Design considerations (Required) 17 | 18 | A description of the design constraints and requirements for the proposal, along with comparisons to similar features in other projects. 19 | 20 | ## Specification (Required) 21 | 22 | A more detailed description of the feature, including the reasoning behind the design choices. 23 | 24 | ### Commands (Optional) 25 | 26 | If any new commands are introduced: 27 | 28 | 1. Command name 29 | - **Request** 30 | - **Response** 31 | 32 | ### Authentication and Authorization (Optional) 33 | 34 | If there are any changes around introducing new ACL command/categories for user access control. 35 | 36 | ### Append-only file (Optional) 37 | 38 | If there are any changes around the persistence mechanism of every write operation. 39 | 40 | ### RDB (Optional) 41 | 42 | If there are any changes in snapshotting mechanisms like new data type, version, etc. 43 | 44 | ### Configuration (Optional) 45 | 46 | If there are any configuration changes introduced to enable/disable/modify the behavior of the feature. 47 | 48 | ### Keyspace notifications (Optional) 49 | 50 | If there are any events to be introduced or modified to observe activity around the dataset. 51 | 52 | ### Cluster mode (Optional) 53 | 54 | If there is any special handling for this feature (e.g., client redirection, Sharded PubSub, etc) in cluster mode or if there are any new cluster bus extensions or messages introduced, list out the changes. 55 | 56 | ### Module API (Optional) 57 | 58 | If any new module APIs are needed to implement or support this feature. 59 | 60 | ### Replication (Optional) 61 | 62 | If there are any changes required in the replication mechanism between a primary and replica. 63 | 64 | ### Networking (Optional) 65 | 66 | If there are any changes introduced in the RESP protocol (RESP), client behavior, new server-client interaction mechanism (TCP, RDMA), etc. 67 | 68 | ### Dependencies (Optional) 69 | 70 | If there are any new dependency libraries required to support the feature. Existing dependencies are jemalloc, lua, etc. If the library needs to be vendored into the project, please add supporting reason for it. 71 | 72 | ### Benchmarking (Optional) 73 | 74 | If there are any benchmarks performed and preliminary results (add the hardware/software setup) are available to share or a set of scenarios identified to measure the feature's performance. 75 | 76 | ### Testing (Optional) 77 | 78 | If there are any test scenarios planned to ensure the feature's stability and validate its behavior. 79 | 80 | ### Observability (Optional) 81 | 82 | If there are any new metrics/stats to be introduced to observe behavior or measure the performance of the feature. 83 | 84 | ### Debug mechanism (Optional) 85 | 86 | If there is any debug mechanism introduced to support admin/operators for maintaining the feature. 87 | 88 | ## Appendix (Optional) 89 | 90 | Links to related material such as issues, pull requests, papers, or other references. 
91 | -------------------------------------------------------------------------------- /ValkeyJSON.md: -------------------------------------------------------------------------------- 1 | --- 2 | RFC: 2 3 | Status: Accepted 4 | --- 5 | 6 | # ValkeyJSON RFC 7 | 8 | ## Abstract 9 | 10 | The proposed Valkey JSON module, named ValkeyJSON, supports the native JavaScript Object Notation (JSON) format to encode 11 | complex datasets inside Valkey. It should be compliant with [RFC7159](http://www.ietf.org/rfc/rfc7159.txt) and [ECMA-404](https://ecma-international.org/publications-and-standards/standards/ecma-404) 12 | JSON data interchange standard. With this feature, users can natively store, query, and modify JSON data structures in 13 | Valkey using the popular [JSONPath query language](https://www.ietf.org/archive/id/draft-goessner-dispatch-jsonpath-00.html). 14 | To help users migrate from Redis and RedisJSON, as well as capitalize on existing OSS RedisJSON client libraries, the module 15 | is designed to be API-compatible and RDB-compatible with Redis Ltd.’s RedisJSON v2. 16 | 17 | ## Motivation 18 | 19 | JSON format is a widely used data exchange format and simplifies the development of applications that store complex data 20 | structures by providing powerful searching and filtering capabilities. However, [Valkey core](https://github.com/valkey-io/valkey) 21 | does not have a native data type for JSON. Redis Ltd.‘s RedisJSON is a popular Redis module, but not under a true 22 | open source license and hence cannot be distributed freely with Valkey. There's a demand in the Valkey 23 | community to have a JSON module that matches most of the features of RedisJSON and is as API-compatible as possible. 24 | See the community discussions [here](https://github.com/orgs/valkey-io/discussions?discussions_q=is%3Aopen+JSON). 25 | 26 | ## Design Considerations 27 | 28 | ValkeyJSON will introduce a new JSON data type for Valkey, and commands to insert, update, delete and query JSON data. 29 | To help users migrate from Redis and RedisJSON, as well as capitalize on existing OSS RedisJSON client libraries, ValkeyJSON 30 | aims to be a drop-in replacement of RedisJSON. Therefore, it is designed to be API-compatible and RDB-compatible with 31 | Redis Ltd.’s RedisJSON. 32 | 33 | ### RDB Compatibility 34 | 35 | ValkeyJSON is RDB compatible with Redis Ltd.’s RedisJSON v1.0.8 or later. 36 | 37 | ### Choice of JSON Library 38 | 39 | We have evaluated 7 open source JSON libraries - BSON, ION, MessagePack, yyjson, cJSON, Serde JSON, and RapidJSON. BSON, 40 | ION and MessagePack do not have write (insert/update/delete) API and therefore cannot be used for the project. Among the 41 | rest, RapidJSON stands out as both memory efficient and providing efficient insert/update/delete API. 42 | 43 | ### JSONPath Query Parser 44 | 45 | [JSONPath](https://www.ietf.org/archive/id/draft-goessner-dispatch-jsonpath-00.html) is a query language for JSON with 46 | features similar to XPath for XML. JSONPath is used for selecting and extracting elements from a JSON document. It 47 | supports advanced query capabilities such as wildcard, filter expressions, slices, union, recursive search, etc. 48 | Using JSONPath query language, users can natively store, query, and modify JSON data structures, either wholly or partially. 49 | 50 | One of the main deficiencies of the RapidJSON library is that it does not support JSONPath query. 
RapidJSON only supports
51 | the [JSON Pointer](https://datatracker.ietf.org/doc/html/rfc6901) query language, which is restricted to single-value selection.
52 | Single-value selection corresponds to the restricted query capability in the RedisJSON v1 API. The RedisJSON v2 API supports a
53 | much more powerful JSONPath syntax with wildcard, filter expressions, slices, union, recursive search, etc., and can select
54 | multiple JSON values. The RedisJSON v2 API is a superset of the v1 API.
55 | 
56 | We should extend RapidJSON to support a subset of the JSONPath query language that is compatible with the RedisJSON v2 API.
57 | To achieve this, we should implement a JSONPath query parser that integrates with RapidJSON. JSONPath query expressions
58 | should apply to all CRUD operations.
59 | 
60 | The query expression should be designed to work with selecting a vector of values instead of a single value, and be
61 | compatible with both the RedisJSON v1 and v2 query syntax. The query parser should automatically detect if the query is of
62 | v1 or v2 syntax.
63 | 
64 | ### Tokenization of JSON Object Keys
65 | 
66 | JSON has two container types - Array and Object. The JSON Object is a key/value mapping where keys are restricted to
67 | be strings, whereas the value can be any valid JSON value, including another container value. Thus, a single JSON document
68 | can contain multiple JSON objects, each with its own unique namespace. It's quite common that a key repeatedly appears
69 | in the same JSON document or across documents. If the number of JSON documents is large, repeated key names can consume
70 | a significant amount of memory.
71 | 
72 | To remove duplicate copies of JSON object key names, the module tokenizes key names by maintaining a global data structure
73 | called KeyTable, which is a sharded hash table storing key tokens and reference counts. Access to the KeyTable is thread-safe.
74 | 
75 | ### Document Size Limit
76 | 
77 | We added a limit on JSON key size as a good design practice. Without the limit, a malicious program could repeatedly call
78 | "json.arrinsert" or similar commands to make a JSON key grow indefinitely, potentially leading to an out-of-memory (OOM) condition.
79 | 
80 | The ValkeyJSON module will provide a configuration option, json.max-document-size, that allows users to set a size limit for
81 | JSON key size.
82 | 
83 | However, the default value of json.max-document-size will be set to 0, meaning the size of a JSON key will be unlimited.
84 | This decision is based on the observation that RedisJSON, a related project, does not have a built-in size limit, and
85 | the core data types in the Valkey system also do not have such a restriction.
86 | 
87 | The amount of memory consumed by a JSON document can be inspected by using the `JSON.DEBUG MEMORY` or `MEMORY USAGE`
88 | command. `JSON.DEBUG MEMORY <key> [path]` can also report the size of a JSON sub-tree.
89 | 
90 | ### Nesting Depth Limit
91 | 
92 | When a JSON object or array has an element that is itself another JSON object or array, that inner object or array is
93 | said to "nest" within the outer object or array. To avoid stack overflow, it's good to have a limit on the depth.
94 | ValkeyJSON will have a default path limit of 128 levels, configurable by module config json.max-path-limit. Any attempt
95 | to create a document deeper than 128 levels will be rejected.
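As a brief illustrative session (the key name, document, and reported byte counts are hypothetical; JSON.SET and CONFIG SET are shown only as examples of the RedisJSON-compatible API surface and of standard module-config handling), the limits and memory introspection described above could be exercised like this:

```
127.0.0.1:6379> CONFIG SET json.max-document-size 1048576
OK
127.0.0.1:6379> CONFIG SET json.max-path-limit 128
OK
127.0.0.1:6379> JSON.SET doc . '{"store":{"book":[{"price":5},{"price":15}]}}'
OK
127.0.0.1:6379> JSON.DEBUG MEMORY doc
(integer) 265
127.0.0.1:6379> JSON.DEBUG MEMORY doc $.store.book
1) (integer) 161
```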
96 | 97 | ## Specification 98 | 99 | ### Supported JSON Standard 100 | 101 | [RFC7159](http://www.ietf.org/rfc/rfc7159.txt) and [ECMA-404](https://ecma-international.org/publications-and-standards/standards/ecma-404) 102 | JSON data interchange standard is supported. UTF-8 Unicode in JSON text is supported. 103 | 104 | ### Root JSON Value 105 | 106 | In earlier RFC 4627, only objects or arrays were allowed as root values of JSON. Since [RFC 7159](http://www.ietf.org/rfc/rfc7159.txt), 107 | the root value of a JSON document can be of any type, scalar type (String, Number, Boolean, Null) or container type (Array, Object). 108 | ValkeyJSON will be compliant with [RFC 7159](http://www.ietf.org/rfc/rfc7159.txt). 109 | 110 | ### JSONPath Query Syntax 111 | 112 | ValkeyJSON supports two kinds of JSONPath query syntaxes: 113 | * Restricted syntax – Has limited query capabilities, compatible with RedisJSON v1. 114 | * Enhanced syntax – Follows the [Goessner-style](https://goessner.net/articles/JsonPath/) JSONPath query syntax, as shown in the table below. 115 | 116 | If a query path starts with '$', the enhanced syntax is used. Otherwise, the restricted syntax is used. A query using 117 | the enhanced syntax always returns an array of values, while a restricted-syntax query always returns a single value. 118 | 119 | Enhanced syntax: 120 | 121 | | Symbol/Expression | Description | 122 | |:------------------|:-------------------------------------------------------------------------| 123 | | $ | the root element | 124 | | . or [] | child operator | 125 | | .. | recursive descent | 126 | | * | wildcard. All elements in an object or array. | 127 | | [] | array subscript operator. Index is 0-based. | 128 | | [,] | union operator | 129 | | [start:end:step] | array slice operator | 130 | | ?() | applies a filter expression to the current array or object | 131 | | @ | used in filter expressions referring to the current node being processed | 132 | | == | equals to, used in filter expressions. | 133 | | != | not equal to, used in filter expressions. | 134 | | > | greater than, used in filter expressions. | 135 | | >= | greater than or equal to, used in filter expressions. | 136 | | < | less than, used in filter expressions. | 137 | | <= | less than or equal to, used in filter expressions. | 138 | | && | logical AND, used to combine multiple filter expressions. | 139 | | || | logical OR, used to combine multiple filter expressions. | 140 | 141 | Examples: 142 | 143 | | JSONPath Expression | Description | 144 | |:--------------------------------------------------------------| :----------- | 145 | | $.store.book[*].author | the authors of all books in the store | 146 | | $..author | all authors | 147 | | $.store.* | all members of the store | 148 | | $["store"].* | all members of the store | 149 | | $.store..price | the price of everything in the store | 150 | | $..* | all recursive members of the JSON structure | 151 | | $..book[*] | all books | 152 | | $..book[0] | the first book | 153 | | $..book[-1] | the last book | 154 | | $..book[0:2] | the first two books | 155 | | $..book[0,1] | the first two books | 156 | | $..book[0:4] | books from index 0 to 3 (ending index is not inclusive) | 157 | | $..book[0:4:2] | books at index 0, 2 | 158 | | $..book[?(@.isbn)] | all books with isbn number | 159 | | $..book[?(@.price<10)] | all books cheaper than $10 | 160 | | '$..book[?(@.price < 10)]' | all books cheaper than $10. 
(The path must be quoted if it contains whitespaces) | 161 | | '$..book[?(@["price"] < 10)]' | all books cheaper than $10 | 162 | | '$..book[?(@.["price"] < 10)]' | all books cheaper than $10 | 163 | | $..book[?(@.price>=10&&@.price<=100)] | all books in the price range of $10 to $100, inclusive | 164 | | '$..book[?(@.price>=10 && @.price<=100)]' | all books in the price range of $10 to $100, inclusive. (The path must be quoted if it contains whitespaces) | 165 | | $..book[?(@.sold==true||@.in-stock==false)] | all books sold or out of stock | 166 | | '$..book[?(@.sold == true || @.in-stock == false)]' | all books sold or out of stock. (The path must be quoted if it contains whitespaces) | 167 | | '$.store.book[?(@.["category"] == "fiction")]' | all books in the fiction category | 168 | | '$.store.book[?(@.["category"] != "fiction")]' | all books in non-fiction categories | 169 | 170 | ### JSON Command API 171 | 172 | The API is compatible with RedisJSON v2. Note that API compatibility here means our command API is a superset of RedisJSON API. 173 | For example, we have command “JSON.DEBUG DEPTH” and “JSON.DEBUG FIELDS”, while they do not. 174 | 175 | #### JSON.ARRAPPEND 176 | 177 | Append one or more values to the array values at the path. 178 | 179 | ##### Syntax 180 | 181 | ```bash 182 | JSON.ARRAPPEND [json ...] 183 | ``` 184 | 185 | * key - required, JSON key 186 | * path - required, a JSON path 187 | * json - required, JSON value to be appended to the array 188 | 189 | ##### Return 190 | 191 | * If the path is enhanced syntax: 192 | * Array of integers, representing the new length of the array at each path. 193 | * If a value at the path is not an array, its corresponding return value is null. 194 | * SYNTAXERR error if one of the input json arguments is not a valid JSON string. 195 | * NONEXISTENT error if the path does not exist. 196 | 197 | * If the path is restricted syntax: 198 | * Integer, the array's new length. 199 | * If multiple array values are selected, the command returns the new length of the last updated array. 200 | * WRONGTYPE error if the value at the path is not an array. 201 | * SYNTAXERR error if one of the input json arguments is not a valid JSON string. 202 | * NONEXISTENT error if the path does not exist. 203 | 204 | #### JSON.ARRINDEX 205 | 206 | Search for the first occurrence of a scalar JSON value in the arrays at the path. 207 | 208 | * Out of range errors are treated by rounding the index to the array's start and end. 209 | * If start > end, return -1 (not found). 210 | 211 | ##### Syntax 212 | 213 | ```bash 214 | JSON.ARRINDEX [start [end]] 215 | ``` 216 | 217 | * key - required, JSON key. 218 | * path - required, a JSON path. 219 | * json-scalar - required, scalar value to search for. JSON scalar refers to values that are not objects or arrays. 220 | i.e., String, number, boolean and null are scalar values. 221 | * start - optional, start index, inclusive. Defaults to 0 if not provided. 222 | * end - optional, end index, exclusive. Defaults to 0 if not provided, which means the last element is included. 223 | 0 or -1 means the last element is included. 224 | 225 | ##### Return 226 | 227 | * If the path is enhanced syntax: 228 | * Array of integers. Each value is the index of the matching element in the array at the path. The value is -1 if not found. 229 | * If a value is not an array, its corresponding return value is null. 230 | 231 | * If the path is restricted syntax: 232 | * Integer, the index of matching element, or -1 if not found. 
233 | * WRONGTYPE error if the value at the path is not an array.
234 | 
235 | 
236 | #### JSON.ARRINSERT
237 | 
238 | Insert one or more values into the array values at path before the index.
239 | 
240 | * Inserting at index 0 prepends to the array.
241 | * A negative index value is interpreted as starting from the end.
242 | * The index must be in the array's boundary.
243 | 
244 | ##### Syntax
245 | 
246 | ```bash
247 | JSON.ARRINSERT <key> <path> <index> <json> [json ...]
248 | ```
249 | 
250 | * key - required, JSON key
251 | * path - required, a JSON path
252 | * index - required, array index before which values are inserted.
253 | * json - required, JSON value to be inserted into the array
254 | 
255 | ##### Return
256 | 
257 | * If the path is enhanced syntax:
258 | * Array of integers, representing the new length of the array at each path.
259 | * If a value is an empty array, its corresponding return value is null.
260 | * If a value is not an array, its corresponding return value is null.
261 | * OUTOFBOUNDARIES error if the index argument is out of bounds.
262 | 
263 | * If the path is restricted syntax:
264 | * Integer, the new length of the array.
265 | * WRONGTYPE error if the value at the path is not an array.
266 | * OUTOFBOUNDARIES error if the index argument is out of bounds.
267 | 
268 | #### JSON.ARRLEN
269 | 
270 | Get length of the array values at the path.
271 | 
272 | ##### Syntax
273 | 
274 | ```bash
275 | JSON.ARRLEN <key> [path]
276 | ```
277 | 
278 | * key - required, JSON key
279 | * path - optional, a JSON path. Defaults to the root path if not provided
280 | 
281 | ##### Return
282 | 
283 | * If the path is enhanced syntax:
284 | * Array of integers, representing the array length at each path.
285 | * If a value is not an array, its corresponding return value is null.
286 | * Null if the document key does not exist.
287 | 
288 | * If the path is restricted syntax:
289 | * Integer, array length.
290 | * If multiple objects are selected, the command returns the first array's length.
291 | * WRONGTYPE error if the value at the path is not an array.
292 | * NONEXISTENT error if the path does not exist.
293 | * Null if the document key does not exist.
294 | 
295 | #### JSON.ARRPOP
296 | 
297 | Remove and return the element at the index from the array. Popping an empty array returns null.
298 | 
299 | ##### Syntax
300 | 
301 | ```bash
302 | JSON.ARRPOP <key> [path [index]]
303 | ```
304 | 
305 | * key - required, JSON key.
306 | * path - optional, a JSON path. Defaults to the root path if not provided.
307 | * index - optional, position in the array to start popping from.
308 | * Defaults to -1 if not provided, which means the last element.
309 | * Negative value means position from the last element.
310 | * Out of boundary indexes are rounded to their respective array boundaries.
311 | 
312 | ##### Return
313 | 
314 | * If the path is enhanced syntax:
315 | * Array of bulk strings, representing popped values at each path.
316 | * If a value is an empty array, its corresponding return value is null.
317 | * If a value is not an array, its corresponding return value is null.
318 | 
319 | * If the path is restricted syntax:
320 | * Bulk string, representing the popped JSON value.
321 | * Null if the array is empty.
322 | * WRONGTYPE error if the value at the path is not an array.
323 | 
324 | #### JSON.ARRTRIM
325 | 
326 | Trim arrays at the path so that each becomes a subarray [start, end], both inclusive.
327 | 
328 | * If the array is empty, do nothing, return 0.
329 | * If start < 0, treat it as 0.
330 | * If end >= size (size of the array), treat it as size-1.
331 | * If start >= size or start > end, empty the array and return 0.
332 | 
333 | ##### Syntax
334 | 
335 | ```bash
336 | JSON.ARRTRIM <key> <path> <start> <end>
337 | ```
338 | 
339 | * key - required, JSON key.
340 | * path - required, a JSON path.
341 | * start - required, start index, inclusive.
342 | * end - required, end index, inclusive.
343 | 
344 | ##### Return
345 | 
346 | * If the path is enhanced syntax:
347 |     * Array of integers, representing the new length of the array at each path.
348 |     * If a value is an empty array, its corresponding return value is null.
349 |     * If a value is not an array, its corresponding return value is null.
350 |     * OUTOFBOUNDARIES error if an index argument is out of bounds.
351 | 
352 | * If the path is restricted syntax:
353 |     * Integer, the new length of the array.
354 |     * Null if the array is empty.
355 |     * WRONGTYPE error if the value at the path is not an array.
356 |     * OUTOFBOUNDARIES error if an index argument is out of bounds.
357 | 
358 | #### JSON.CLEAR
359 | 
360 | Clear the array or object values at the path.
361 | 
362 | ##### Syntax
363 | 
364 | ```bash
365 | JSON.CLEAR <key> [path]
366 | ```
367 | 
368 | * key - required, JSON key.
369 | * path - optional, a JSON path. Defaults to the root path if not provided.
370 | 
371 | ##### Return
372 | 
373 | * Integer, the number of containers cleared.
374 | * Clearing an empty array or object counts as 0 containers cleared.
375 | * Clearing a non-container value returns 0.
376 | * If no array or object value is located by the path, the command returns 0.
377 | 
378 | #### JSON.DEBUG
379 | 
380 | Report information. Supported subcommands are:
381 | 
382 | * MEMORY `<key>` [path] - report memory usage in bytes of a JSON value. Path defaults to the root if not provided.
383 | * DEPTH `<key>` - report the maximum path depth of the JSON document.
384 | * FIELDS `<key>` [path] - report the number of fields at the specified document path. Path defaults to the root if not provided.
385 |   Each non-container JSON value counts as one field. Objects and arrays recursively count one field for each of their
386 |   containing JSON values. Each container value, except the root container, counts as one additional field.
387 | * HELP - print help messages of the command.
388 | 
389 | ##### Syntax
390 | 
391 | ```bash
392 | JSON.DEBUG <subcommand & arguments>
393 | ```
394 | 
395 | ##### Return
396 | 
397 | Depends on the subcommand:
398 | 
399 | * MEMORY
400 |     * If the path is enhanced syntax:
401 |         * returns an array of integers, representing the memory size (in bytes) of the JSON value at each path.
402 |         * returns an empty array if the JSON key does not exist.
403 |     * If the path is restricted syntax:
404 |         * returns an integer, the memory size of the JSON value in bytes.
405 |         * returns null if the JSON key does not exist.
406 | * DEPTH
407 |     * returns an integer, the maximum path depth of the JSON document.
408 |     * returns null if the JSON key does not exist.
409 | * FIELDS
410 |     * If the path is enhanced syntax:
411 |         * returns an array of integers, representing the number of fields of the JSON value at each path.
412 |         * returns an empty array if the JSON key does not exist.
413 |     * If the path is restricted syntax:
414 |         * returns an integer, the number of fields of the JSON value.
415 |         * returns null if the JSON key does not exist.
416 | * HELP
417 |     * returns an array of help messages.
418 | 
419 | #### JSON.DEL
420 | 
421 | Delete the JSON values at the path in a JSON key. If the path is the root path, it is equivalent to deleting
422 | the key from Valkey.
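As an illustration of the root-path behavior, a hypothetical valkey-cli session (the key name `doc` and its contents are assumptions used only for this sketch, not prescribed by this RFC):

```text
127.0.0.1:6379> JSON.SET doc . '{"a": 1, "b": {"c": 2}}'
OK
127.0.0.1:6379> JSON.DEL doc .b.c
(integer) 1
127.0.0.1:6379> JSON.DEL doc
(integer) 1
127.0.0.1:6379> EXISTS doc
(integer) 0
```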
423 | 424 | ##### Syntax 425 | 426 | ```bash 427 | JSON.DEL [path] 428 | ``` 429 | 430 | * key - required, JSON key. 431 | * path - optional, a JSON path. Defaults to the root path if not provided. 432 | 433 | ##### Return 434 | 435 | * Number of elements deleted. 436 | * 0 if the JSON key does not exist. 437 | * 0 if the JSON path is invalid or does not exist. 438 | 439 | #### JSON.FORGET 440 | An alias of JSON.DEL 441 | 442 | #### JSON.GET 443 | 444 | Get the serialized JSON at one or multiple paths. 445 | 446 | ##### Syntax 447 | 448 | ```bash 449 | JSON.GET 450 | [INDENT indentation-string] 451 | [NEWLINE newline-string] 452 | [SPACE space-string] 453 | [NOESCAPE] 454 | [path ...] 455 | ``` 456 | 457 | * key - required, JSON key. 458 | * INDENT/NEWLINE/SPACE - optional, controls the format of the returned JSON string, i.e., "pretty print". The default 459 | value of each one is empty string. They can be overidden in any combination. They can be specified in any order. 460 | * NOESCAPE - optional, allowed to be present for legacy compatibility and has no other effect. 461 | * path - optional, zero or more JSON paths, defaults to the root path if none is given. The path arguments must be 462 | placed at the end. 463 | 464 | ##### Return 465 | 466 | * Enhanced path syntax: 467 | * If one path is given: 468 | * Return serialized string of an array of values. 469 | * If no value is selected, the command returns an empty array. 470 | * If multiple paths are given: 471 | * Return a stringified JSON object, in which each path is a key. 472 | * If there are mixed enhanced and restricted path syntax, the result conforms to the enhanced syntax. 473 | * If a path does not exist, its corresponding value is an empty array. 474 | 475 | * Restricted path syntax: 476 | * If one path is given: 477 | * Return serialized string of the value at the path. 478 | * If multiple values are selected, the command returns the first value. 479 | * If the path does not exist, the command returns NONEXISTENT error. 480 | * If multiple paths are given: 481 | * Return a stringified JSON object, in which each path is a key. 482 | * The result conforms to the restricted path syntax if and only if all paths are restricted paths. 483 | * If a path does not exist, the command returns NONEXISTENT error 484 | 485 | #### JSON.MGET 486 | 487 | Get serialized JSONs at the path from multiple document keys. Return null for non-existent key or JSON path. 488 | 489 | ##### Syntax 490 | 491 | ```bash 492 | JSON.MGET [key ...] 493 | ``` 494 | 495 | * key - required, one or more JSON keys. 496 | * path - required, a JSON path. 497 | 498 | ##### Return 499 | 500 | * Array of Bulk Strings. The size of the array is equal to the number of keys in the command. Each element of the array 501 | is populated with either (a) the serialized JSON as located by the path or (b) Null if the key does not exist or the 502 | path does not exist in the document or the path is invalid (syntax error). 503 | * If any of the specified keys exists and is not a JSON key, the command returns WRONGTYPE error. 504 | 505 | #### JSON.MSET 506 | 507 | Set JSON values for multiple keys. The operation is atomic. Either all values are set or none is set. 508 | 509 | ##### Syntax 510 | 511 | ```bash 512 | JSON.MSET [key ...] 
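# Illustrative only (the key names, paths and values below are assumptions for this sketch):
#   JSON.MSET k1 . '{"a": 1}' k2 . '{"b": 2}'
# Arguments are grouped per key as key path value (see the argument list below);
# the operation is atomic, so either every key is written or none is.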
513 | ``` 514 | 515 | * key - required, JSON key 516 | * path - required, a JSON path 517 | * value - JSON value 518 | 519 | ##### Return 520 | 521 | * Simple String 'OK' on success 522 | * Error on failure 523 | 524 | #### JSON.NUMINCRBY 525 | 526 | Increment the number values at the path by a given number. 527 | 528 | ##### Syntax 529 | 530 | ```bash 531 | JSON.NUMINCRBY 532 | ``` 533 | 534 | * key - required, JSON key 535 | * path - required, a JSON path 536 | * number - required, a number 537 | 538 | ##### Return 539 | 540 | * If the path is enhanced syntax: 541 | * Array of bulk Strings representing the resulting value at each path. 542 | * If a value is not a number, its corresponding return value is null. 543 | * WRONGTYPE error if the number cannot be parsed. 544 | * OVERFLOW error if the result is out of the range of double. 545 | * NONEXISTENT if the document key does not exist. 546 | 547 | * If the path is restricted syntax: 548 | * Bulk String representing the resulting value. 549 | * If multiple values are selected, the command returns the result of the last updated value. 550 | * WRONGTYPE error if the value at the path is not a number. 551 | * WRONGTYPE error if the number cannot be parsed. 552 | * OVERFLOW error if the result is out of the range of double. 553 | * NONEXISTENT if the document key does not exist. 554 | 555 | #### JSON.NUMMULTBY 556 | 557 | Multiply the number values at the path by a given number. 558 | 559 | ##### Syntax 560 | 561 | ```bash 562 | JSON.NUMMULTBY 563 | ``` 564 | 565 | * key - required, JSON key 566 | * path - required, a JSON path 567 | * number - required, a number 568 | 569 | ##### Return 570 | 571 | * If the path is enhanced syntax: 572 | * Array of bulk Strings representing the resulting value at each path. 573 | * If a value is not a number, its corresponding return value is null. 574 | * WRONGTYPE error if the number cannot be parsed. 575 | * OVERFLOW error if the result is out of the range of double. 576 | * NONEXISTENT if the document key does not exist. 577 | 578 | * If the path is restricted syntax: 579 | * Bulk String representing the resulting value. 580 | * If multiple values are selected, the command returns the result of the last updated value. 581 | * WRONGTYPE error if the value at the path is not a number. 582 | * WRONGTYPE error if the number cannot be parsed. 583 | * OVERFLOW error if the result is out of the range of double. 584 | * NONEXISTENT if the document key does not exist. 585 | 586 | #### JSON.OBJLEN 587 | 588 | Get number of keys in the object values at the path. 589 | 590 | ##### Syntax 591 | 592 | ```bash 593 | JSON.OBJLEN [path] 594 | ``` 595 | 596 | * key - required, JSON key 597 | * path - optional, a JSON path. Defaults to the root path if not provided 598 | 599 | ##### Return 600 | 601 | * If the path is enhanced syntax: 602 | * Array of integers, representing the object length at each path. 603 | * If a value is not an object, its corresponding return value is null. 604 | * Null if the document key does not exist. 605 | 606 | * If the path is restricted syntax: 607 | * Integer, number of keys in the object. 608 | * If multiple objects are selected, the command returns the first object's length. 609 | * WRONGTYPE error if the value at the path is not an object. 610 | * NONEXISTENT error if the path does not exist. 611 | * Null if the document key does not exist. 612 | 613 | #### JSON.OBJKEYS 614 | 615 | Get key names in the object values at the path. 
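A brief hypothetical session (the key `store` and its document are assumptions used only for illustration):

```text
127.0.0.1:6379> JSON.SET store . '{"inventory": {"headphones": 25, "keyboards": 10}}'
OK
127.0.0.1:6379> JSON.OBJKEYS store .inventory
1) "headphones"
2) "keyboards"
```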
616 | 617 | ##### Syntax 618 | 619 | ```bash 620 | JSON.OBJKEYS [path] 621 | ``` 622 | 623 | * key - required, JSON key 624 | * path - optional, a JSON path. Defaults to the root path if not provided 625 | 626 | ##### Return 627 | 628 | * If the path is enhanced syntax: 629 | * Array of array of bulk strings. Each element is an array of keys in a matching object. 630 | * If a value is not an object, its corresponding return value is empty value. 631 | * Null if the document key does not exist. 632 | 633 | * If the path is restricted syntax: 634 | * Array of bulk strings. Each element is a key name in the object. 635 | * If multiple objects are selected, the command returns the keys of the first object. 636 | * WRONGTYPE error if the value at the path is not an object. 637 | * NONEXISTENT error if the path does not exist. 638 | * Null if the document key does not exist. 639 | 640 | #### JSON.RESP 641 | 642 | Return the JSON value at the given path in Redis Serialization Protocol (RESP). 643 | If the value is container, the response is RESP array or nested array. 644 | 645 | * JSON null is mapped to the RESP Null Bulk String. 646 | * JSON boolean values are mapped to the respective RESP Simple Strings. 647 | * Integer numbers are mapped to RESP Integers. 648 | * Floating point numbers are mapped to RESP Bulk Strings. 649 | * JSON Strings are mapped to RESP Bulk Strings. 650 | * JSON Arrays are represented as RESP Arrays, where the first element is the simple string [, 651 | followed by the array's elements. 652 | * JSON Objects are represented as RESP Arrays, where the first element is the simple string {, 653 | followed by key-value pairs, each of which is a RESP bulk string. 654 | 655 | ##### Syntax 656 | 657 | ```bash 658 | JSON.RESP [path] 659 | ``` 660 | * key - required, JSON key 661 | * path - optional, a JSON path. Defaults to the root path if not provided 662 | 663 | ##### Return 664 | 665 | * If the path is enhanced syntax: 666 | * Array of arrays. Each array element represents the RESP form of the value at one path. 667 | * Empty array if the document key does not exist. 668 | 669 | * If the path is restricted syntax: 670 | * Array, representing the RESP form of the value at the path. 671 | * Null if the document key does not exist. 672 | 673 | #### JSON.SET 674 | 675 | Set JSON values at the path. 676 | 677 | * If the path calls for an object member: 678 | * If the parent element does not exist, the command will return NONEXISTENT error. 679 | * If the parent element exists but is not an object, the command will return ERROR. 680 | * If the parent element exists and is an object: 681 | * If the member does not exist, a new member will be appended to the parent object if and only if the parent 682 | object is the last child in the path. Otherwise, the command will return NONEXISTENT error. 683 | * If the member exists, its value will be replaced by the JSON value. 684 | * If the path calls for an array index: 685 | * If the parent element does not exist, the command will return a NONEXISTENT error. 686 | * If the parent element exists but is not an array, the command will return ERROR. 687 | * If the parent element exists but the index is out of bounds, the command will return OUTOFBOUNDARIES error. 688 | * If the parent element exists and the index is valid, the element will be replaced by the new JSON value. 689 | * If the path calls for an object or array, the value (object or array) will be replaced by the new JSON value. 
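To illustrate these rules and the NX/XX conditions described below, a hypothetical session (the key `cfg` and its values are assumptions for this sketch):

```text
127.0.0.1:6379> JSON.SET cfg . '{"limits": {"max": 10}}'
OK
127.0.0.1:6379> JSON.SET cfg .limits.min 1
OK
127.0.0.1:6379> JSON.SET cfg . '{}' NX
(nil)
127.0.0.1:6379> JSON.SET cfg .limits.max 20 XX
OK
```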
690 | 691 | ##### Syntax 692 | 693 | ```bash 694 | JSON.SET [NX | XX] 695 | ``` 696 | 697 | * key - required, JSON key. 698 | * path - required, JSON path. For a new JSON key, the JSON path must be the root path ".". 699 | * json - required, JSON representing the new value 700 | * NX - optional. If the path is the root path, set the value only if the JSON key does not exist, i.e., insert a new document. 701 | If the path is not the root path, set the value only if the path does not exist, i.e., insert a value into the document. 702 | * XX - optional. If the path is the root path, set the value only if the JSON key exists, i.e., replace the existing document. 703 | If the path is not the root path, set the value only if the path exists, i.e., update the existing value. 704 | 705 | ##### Return 706 | 707 | * Simple String 'OK' on success. 708 | * Null if the NX or XX condition is not met. 709 | 710 | #### JSON.STRAPPEND 711 | 712 | Append a string to the JSON strings at the path. 713 | 714 | ##### Syntax 715 | 716 | ```bash 717 | JSON.STRAPPEND [path] 718 | ``` 719 | 720 | * key - required, JSON key. 721 | * path - optional, a JSON path. Defaults to the root path if not provided. 722 | * json_string - required, JSON representation of a string. Note that a JSON string must be quoted, i.e., '"foo"'. 723 | 724 | ##### Return 725 | 726 | * If the path is enhanced syntax: 727 | * Array of integers, representing the new length of the string at each path. 728 | * If a value at the path is not a string, its corresponding return value is null. 729 | * SYNTAXERR error if the input json argument is not a valid JSON string. 730 | * NONEXISTENT error if the path does not exist. 731 | 732 | * If the path is restricted syntax: 733 | * Integer, the string's new length. 734 | * If multiple string values are selected, the command returns the new length of the last updated string. 735 | * WRONGTYPE error if the value at the path is not a string. 736 | * WRONGTYPE error if the input json argument is not a valid JSON string. 737 | * NONEXISTENT error if the path does not exist. 738 | 739 | #### JSON.STRLEN 740 | 741 | Get lengths of the JSON string values at the path. 742 | 743 | ##### Syntax 744 | 745 | ```bash 746 | JSON.STRLEN [path] 747 | ``` 748 | 749 | * key - required, JSON key 750 | * path - optional, a JSON path. Defaults to the root path if not provided 751 | 752 | ##### Return 753 | 754 | * If the path is enhanced syntax: 755 | * Array of integers, representing the length of string value at each path. 756 | * If a value is not a string, its corresponding return value is null. 757 | * Null if the document key does not exist. 758 | 759 | * If the path is restricted syntax: 760 | * Integer, the string's length. 761 | * If multiple string values are selected, the command returns the first string's length. 762 | * WRONGTYPE error if the value at the path is not a string. 763 | * NONEXISTENT error if the path does not exist. 764 | * Null if the document key does not exist. 765 | 766 | #### JSON.TOGGLE 767 | 768 | Toggle boolean values between true and false at the path. 769 | 770 | ##### Syntax 771 | 772 | ```bash 773 | JSON.TOGGLE [path] 774 | ``` 775 | 776 | * key - required, JSON key 777 | * path - optional, a JSON path. Defaults to the root path if not provided 778 | 779 | ##### Return 780 | 781 | * If the path is enhanced syntax: 782 | * Array of integers (0 - false, 1 - true) representing the resulting boolean value at each path. 
783 | * If a value is a not boolean, its corresponding return value is null. 784 | * NONEXISTENT if the document key does not exist. 785 | 786 | * If the path is restricted syntax: 787 | * String ("true"/"false") representing the resulting boolean value. 788 | * NONEXISTENT if the document key does not exist. 789 | * WRONGTYPE error if the value at the path is not a boolean. 790 | 791 | #### JSON.TYPE 792 | 793 | Report type of the values at the given path. 794 | 795 | ##### Syntax 796 | 797 | ```bash 798 | JSON.TYPE [path] 799 | ``` 800 | 801 | * key - required, JSON key 802 | * path - optional, a JSON path. Defaults to the root path if not provided 803 | 804 | ##### Return 805 | 806 | * If the path is enhanced syntax: 807 | * Array of strings, representing type of the value at each path. The type is one of {"null", "boolean", "string", "number", "integer", "object" and "array"}. 808 | * If a path does not exist, its corresponding return value is null. 809 | * Empty array if the document key does not exist. 810 | 811 | * If the path is restricted syntax: 812 | * String, type of the value 813 | * Null if the document key does not exist. 814 | * Null if the JSON path is invalid or does not exist. 815 | 816 | ### ACL 817 | 818 | ValkeyJSON introduces a new ACL category - @json. The category includes all JSON commands. No existing Valkey commands 819 | are members of the @json category. 820 | 821 | There are 4 existing ACL categories which are updated to include new JSON commands: @read, @write, @fast, @slow. The 822 | table below has a column for each of these categories and a row for each command. If the cell contains a “y” then that 823 | command must be added into that category. All other command members of those categories remain unchanged. 824 | 825 | | JSON Command | @json | @read | @write | @fast | @slow | 826 | |:---------------|:------|:------|:-------|:------|:------| 827 | | JSON.ARRAPPEND | y | | y | y | | 828 | | JSON.ARRINDEX | y | y | | y | | 829 | | JSON.ARRINSERT | y | | y | y | | 830 | | JSON.ARRLEN | y | y | | y | | 831 | | JSON.ARRPOP | y | | y | y | | 832 | | JSON.ARRTRIM | y | | y | y | | 833 | | JSON.CLEAR | y | | y | y | | 834 | | JSON.DEBUG | y | y | | | y | 835 | | JSON.DEL | y | | y | y | | 836 | | JSON.FORGET | y | | y | y | | 837 | | JSON.GET | y | y | | y | | 838 | | JSON.MGET | y | y | | y | | 839 | | JSON.MSET | y | | y | | y | 840 | | JSON.NUMINCRBY | y | | y | y | | 841 | | JSON.NUMMULTBY | y | | y | y | | 842 | | JSON.OBJKEYS | y | y | | y | | 843 | | JSON.OBJLEN | y | y | | y | | 844 | | JSON.RESP | y | y | | y | | 845 | | JSON.SET | y | | y | | y | 846 | | JSON.STRAPPEND | y | | y | y | | 847 | | JSON.STRLEN | y | y | | y | | 848 | | JSON.TOGGLE | y | | y | y | | 849 | | JSON.TYPE | y | y | | y | | 850 | 851 | ### Info Metrics 852 | 853 | Info metrics are visible through the “info json” or “info modules” command. 854 | 855 | | Info Name | Description | 856 | |:------------------------|:------------------------------------------------------------------| 857 | | json_total_memory_bytes | Total amount of memory allocated to JSON documents and meta data. | 858 | | json_num_documents | Number of JSON keys. | 859 | 860 | ### Module Configs 861 | 862 | | Config Name | Default Value | Unit | Description | 863 | |:-----------------------|:--------------|:-----|:------------------------------------------------------| 864 | | json.max-document-size | 64 | MB | Maximum memory allowed for a single JSON document. 
| 865 | | json.max-path-limit | 128 | | Maximum nesting levels within a single JSON document. |
866 | 
867 | ### Module API
868 | 
869 | ValkeyJSON shall be implemented via the [Valkey modules API](https://valkey.io/topics/modules-intro/).
870 | 
871 | #### Module OnLoad
872 | 
873 | Upon loading, the module registers a new JSON data type. All operations such as query, insert, update and delete are
874 | efficiently performed on the in-memory document objects, as opposed to JSON text.
875 | 
876 | * Module name: json
877 | * JSON data type name: ReJSON-RL (Note: We use the same name as the one in RedisJSON for the sake of RDB compatibility.)
878 | 
879 | #### Persistence
880 | 
881 | ValkeyJSON hooks into Valkey's persistence API via the module type callbacks:
882 | 
883 | * rdb_save: Serializes document objects to RDB. The serialized JSON string is saved in the RDB.
884 | * rdb_load: Deserializes document objects from RDB.
885 | * aof_rewrite: Emits commands into the AOF during the AOF rewriting process.
886 | 
887 | #### Memory Management
888 | 
889 | The JSON data type also supports memory management related callbacks:
890 | 
891 | * free: Deallocates a key when it is deleted, expired or evicted.
892 | * defrag: Supports active defrag for JSON keys.
893 | * mem_usage: Reports the JSON document size (i.e., the MEMORY USAGE command).
894 | * copy: Supports copying of JSON keys.
895 | 
896 | #### Keyspace Event Notification
897 | 
898 | Every JSON write command publishes a keyspace event after the data is mutated.
899 | * Event type: REDISMODULE_NOTIFY_GENERIC
900 | * Event name: command name in lowercase, e.g., the json.set command publishes the event "json.set".
901 | 
902 | Users can subscribe to the JSON events via the standard keyspace event pub/sub. For example:
903 | 
904 | ```text
905 | 1. enable keyspace event notifications:
906 |    valkey-cli config set notify-keyspace-events KEA
907 | 2. subscribe to keyspace & keyevent event channels:
908 |    valkey-cli psubscribe '__key*__:*'
909 | ```
910 | 
911 | #### Replication
912 | 
913 | Every JSON write command is replicated to replicas by calling ValkeyModule_ReplicateVerbatim.
914 | 
915 | ## References
916 | 
917 | * [JSON Command API](https://docs.aws.amazon.com/memorydb/latest/devguide/json-list-commands.html)
918 | * [JSONPath query syntax](https://docs.aws.amazon.com/memorydb/latest/devguide/json-document-overview.html#json-path-syntax)
919 | * JSON blog: [Unlocking JSON workloads with ElastiCache and MemoryDB](https://aws.amazon.com/blogs/database/unlocking-json-workloads-with-elasticache-and-memorydb)
920 | 
--------------------------------------------------------------------------------