├── .gitignore ├── large-logo.png ├── adr ├── images │ ├── 0049-assisted.png │ ├── 0008-topologies.png │ ├── 0049-aggregate.png │ ├── 0003-jaeger-trace.png │ └── stream-transform.png ├── ADR-52.md ├── ADR-5.md ├── ADR-33.md ├── ADR-11.md ├── ADR-17.md ├── ADR-10.md ├── ADR-9.md ├── ADR-34.md ├── ADR-18.md ├── ADR-12.md ├── ADR-56.md ├── ADR-48.md ├── ADR-55.md ├── ADR-6.md ├── ADR-22.md ├── ADR-35.md ├── ADR-3.md ├── ADR-28.md ├── ADR-43.md ├── ADR-19.md ├── ADR-47.md ├── ADR-7.md ├── ADR-54.md ├── ADR-14.md ├── ADR-4.md ├── ADR-2.md ├── ADR-21.md ├── ADR-30.md ├── ADR-39.md ├── ADR-36.md ├── ADR-44.md ├── ADR-13.md ├── ADR-51.md └── ADR-1.md ├── GOVERNANCE.md ├── go.mod ├── .github └── workflows │ └── validate.yaml ├── .readme.templ ├── adr-template.md ├── go.sum ├── main.go └── LICENSE /.gitignore: -------------------------------------------------------------------------------- 1 | .idea 2 | .DS_Store 3 | -------------------------------------------------------------------------------- /large-logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nats-io/nats-architecture-and-design/HEAD/large-logo.png -------------------------------------------------------------------------------- /adr/images/0049-assisted.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nats-io/nats-architecture-and-design/HEAD/adr/images/0049-assisted.png -------------------------------------------------------------------------------- /adr/images/0008-topologies.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nats-io/nats-architecture-and-design/HEAD/adr/images/0008-topologies.png -------------------------------------------------------------------------------- /adr/images/0049-aggregate.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/nats-io/nats-architecture-and-design/HEAD/adr/images/0049-aggregate.png -------------------------------------------------------------------------------- /adr/images/0003-jaeger-trace.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nats-io/nats-architecture-and-design/HEAD/adr/images/0003-jaeger-trace.png -------------------------------------------------------------------------------- /adr/images/stream-transform.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nats-io/nats-architecture-and-design/HEAD/adr/images/stream-transform.png -------------------------------------------------------------------------------- /GOVERNANCE.md: -------------------------------------------------------------------------------- 1 | # NATS Architecture and Design Governance 2 | 3 | NATS Architecture and Design is part of the NATS project and is subject to the [NATS Governance](https://github.com/nats-io/nats-general/blob/master/GOVERNANCE.md). 
4 | -------------------------------------------------------------------------------- /go.mod: -------------------------------------------------------------------------------- 1 | module github.com/nats-io/nats-architecture-and-design 2 | 3 | go 1.24 4 | 5 | require gitlab.com/golang-commonmark/markdown v0.0.0-20211110145824-bf3e522c626a 6 | 7 | require ( 8 | gitlab.com/golang-commonmark/html v0.0.0-20191124015941-a22733972181 // indirect 9 | gitlab.com/golang-commonmark/linkify v0.0.0-20200225224916-64bca66f6ad3 // indirect 10 | gitlab.com/golang-commonmark/mdurl v0.0.0-20191124015652-932350d1cb84 // indirect 11 | gitlab.com/golang-commonmark/puny v0.0.0-20191124015043-9f83538fa04f // indirect 12 | golang.org/x/text v0.24.0 // indirect 13 | ) 14 | -------------------------------------------------------------------------------- /.github/workflows/validate.yaml: -------------------------------------------------------------------------------- 1 | name: Testing 2 | on: [pull_request] 3 | 4 | jobs: 5 | lint_and_test: 6 | runs-on: ubuntu-latest 7 | env: 8 | GO111MODULE: "on" 9 | steps: 10 | - name: Checkout code 11 | uses: actions/checkout@v4 12 | with: 13 | ref: ${{ github.ref }} 14 | 15 | - name: Setup Go 16 | uses: actions/setup-go@v5 17 | with: 18 | go-version-file: "go.mod" 19 | 20 | - name: Valid metadata and readme updated 21 | shell: bash --noprofile --norc -x -eo pipefail {0} 22 | run: | 23 | go run main.go > /tmp/readme.new 24 | diff /tmp/readme.new README.md 25 | -------------------------------------------------------------------------------- /adr/ADR-52.md: -------------------------------------------------------------------------------- 1 | # No Headers support for Direct Get 2 | 3 | | Metadata | Value | 4 | |----------|------------| 5 | | Date | 2025-06-19 | 6 | | Author | @ripienaar | 7 | | Status | Deprecated | 8 | | Tags | deprecated | 9 | 10 | ## Deprecated 11 | 12 | This feature was targeted for 2.12 but the outcome was to increase inconsistency in the 
msg get API while at the same time being a potentially premature optimisation. 13 | 14 | # Context 15 | 16 | Often the only part of a message users care about is the body; a good example is the counters introduced in ADR-49, where there is a data section in the body and a control section in the headers. For clients that only care about the current count there is no need to even download the headers from the server. 17 | 18 | We support this feature in both the Direct Get and the Message Get API. 19 | 20 | # API Changes 21 | 22 | We add an option to the `JSApiMsgGetRequest` structure as below: 23 | 24 | ```go 25 | type JSApiMsgGetRequest struct { 26 | // .... 27 | 28 | // NoHeaders disables sending any headers with the body payload 29 | NoHeaders bool `json:"no_hdr,omitempty"` 30 | } 31 | ``` 32 | 33 | When set, the server will simply send the body without any headers; headers like `Nats-Sequence` will also be unset. 34 | 35 | This also applies to batch direct gets, where no messages in the batch will have headers. However, the final zero-payload message will still have the usual batch control headers. -------------------------------------------------------------------------------- /.readme.templ: -------------------------------------------------------------------------------- 1 | ![NATS](large-logo.png) 2 | 3 | # NATS Architecture And Design 4 | 5 | This repository captures Architecture, Design Specifications and Feature Guidance for the NATS ecosystem. 6 | 7 | {{- range . }} 8 | ## {{ .Tag | title }} 9 | 10 | |Index|Tags|Description| 11 | |-----|----|-----------| 12 | {{- range .Adrs }} 13 | |[ADR-{{.Meta.Index}}]({{.Meta.Path}})|{{.Meta.Tags|join}}|{{.Heading}}| 14 | {{- end }} 15 | {{ end }} 16 | ## When to write an ADR 17 | 18 | We use this repository in a few ways: 19 | 20 | 1. Design specifications where a single document captures everything about a feature, examples are ADR-8, ADR-32, ADR-37 and ADR-40 21 | 1. 
Guidance on conventions and design, such as ADR-6 which documents all the valid naming rules 22 | 1. Capturing design that might impact many areas of the system, such as ADR-2 23 | 24 | We want to move away from using these to document individual minor decisions, moving instead to spec-like documents that are living documents and can change over time, each capturing revisions and history. 25 | 26 | ## Template 27 | 28 | Please see the [template](adr-template.md). The template body is a guideline. Feel free to add sections as you feel appropriate. Look at the other ADRs for examples. However, the initial Table of metadata and header format is required to match. 29 | 30 | After editing / adding an ADR please run `go run main.go > README.md` to update the embedded index. This will also validate the header part of your ADR. 31 | -------------------------------------------------------------------------------- /adr/ADR-5.md: -------------------------------------------------------------------------------- 1 | # Lame Duck Notification 2 | 3 | |Metadata|Value| 4 | |--------|-----| 5 | |Date |2020-07-20| 6 | |Author |@aricart| 7 | |Status |Partially Implemented| 8 | |Tags |server, client| 9 | 10 | ## Context 11 | 12 | This document describes the _Lame Duck Mode_ server notification. When a server enters lame duck mode, it removes itself from being advertised in the cluster, and slowly starts evicting connected clients as per [`lame_duck_duration`](https://docs.nats.io/nats-server/configuration#runtime-configuration). This document describes how this information is communicated 13 | to the client, in order to allow clients to cooperate and initiate an orderly migration to a different server in the cluster. 14 | 15 | 16 | ## Decision 17 | 18 | The server notifies that it has entered _lame duck mode_ by sending an [`INFO`](https://docs.nats.io/nats-protocol/nats-protocol#info) update. 
If the `ldm` property is set to true, the server has entered _lame duck mode_ and the client should initiate an orderly self-disconnect or close. Note the `ldm` property is only available on servers that implement the notification feature. 19 | 20 | ## Consequences 21 | 22 | By becoming aware of a server changing state to _lame duck mode_, clients can disconnect from a server in an orderly manner and connect to a different server. Currently clients have no automatic support to _disconnect_ while keeping current state. Future documentation will describe strategies for initiating a new connection and exiting the old one. 23 | -------------------------------------------------------------------------------- /adr/ADR-33.md: -------------------------------------------------------------------------------- 1 | # Metadata for Stream and Consumer 2 | 3 | |Metadata|Value| 4 | |--------|-----| 5 | |Date |2023-01-23| 6 | |Author |@Jarema| 7 | |Status |Approved| 8 | |Tags |jetstream, client, server| 9 | 10 | ## Context and Problem Statement 11 | 12 | Until now, there was no way to easily add additional information about a Stream or Consumer. 13 | The only solution was using the `Description` field, which is not an ergonomic workaround. 14 | 15 | ## Server PR 16 | https://github.com/nats-io/nats-server/pull/3797 17 | 18 | ## Design 19 | 20 | The solution is to add a new `metadata` field to both the `Consumer` and `Stream` config. 21 | The `metadata` field would be a map of `string` keys and `string` values. 22 | 23 | ### JSON representation 24 | The map would be represented in JSON as an object with nested key/value pairs, which is the default 25 | way to marshal maps/hashmaps in most languages. 26 | 27 | ### Size limit 28 | To avoid abuse of the metadata, its size is limited to 128KB. 29 | The size is the sum of the lengths of all keys and values. 30 | 31 | ### Reserved prefix 32 | `_nats` is a reserved prefix. 33 | It will be used for any potential internals of the server or clients. 
34 | The server can lock its metadata to be immutable and deny any changes. 35 | 36 | 37 | ### Example 38 | ```json 39 | { 40 | "durable_name": "consumer", 41 | ... // other consumer/stream fields 42 | "metadata": { 43 | "owner": "nack", 44 | "domain": "product", 45 | "_nats_created_version": "1.10.0" 46 | } 47 | } 48 | 49 | ``` 50 | 51 | 52 | -------------------------------------------------------------------------------- /adr-template.md: -------------------------------------------------------------------------------- 1 | # Title 2 | 3 | | Metadata | Value | 4 | |----------|------------------------------------------------------------------------------| 5 | | Date | YYYY-MM-DD | 6 | | Author | @, @ | 7 | | Status | `Proposed`, `Approved`, `Partially Implemented`, `Implemented`, `Deprecated` | 8 | | Tags | jetstream, client | 9 | | Updates | ADR-XX in the case of a refinement, else remove | 10 | 11 | | Revision | Date | Author | Info | 12 | |----------|------------|---------|----------------| 13 | | 1 | YYYY-MM-DD | @author | Initial design | 14 | 15 | ## Context and Problem Statement 16 | 17 | [Describe the context and problem statement, e.g., in free form using two to three sentences. You may want to articulate the problem in the form of a question.] 18 | 19 | ## [Context | References | Prior Work] 20 | 21 | [What does the reader need to know before the design. These sections are optional and can be separate or combined.] 22 | 23 | ## Design 24 | 25 | [If this is a specification or actual design, write something here.] 26 | 27 | ## Decision 28 | 29 | [Maybe this was just an architectural decision...] 
30 | 31 | ## Consequences 32 | 33 | [Any consequences of this design, such as a breaking change or Vorpal Bunnies] 34 | -------------------------------------------------------------------------------- /adr/ADR-11.md: -------------------------------------------------------------------------------- 1 | # Hostname resolution 2 | 3 | |Metadata|Value| 4 | |--------|-----| 5 | |Date |2021-07-21| 6 | |Author |@kozlovic| 7 | |Status |Approved| 8 | |Tags |client| 9 | 10 | ## Context and Problem Statement 11 | 12 | The client library should take a random IP address when performing a host name resolution prior to creating the TCP connection. 13 | 14 | ## Prior Work 15 | 16 | The Go client performs host name resolution as shown [here](https://github.com/nats-io/nats.go/blob/2b2bb8f326dfdd2814ba6d59c59b562354b1af30/nats.go#L1641) 17 | and then shuffles this list (unless the `NoRandomize` option is enabled) as shown [here](https://github.com/nats-io/nats.go/blob/2b2bb8f326dfdd2814ba6d59c59b562354b1af30/nats.go#L1663). 18 | 19 | ## Design 20 | 21 | When the library is about to create a TCP connection, if given a host name (and not an IP), a name resolution must be performed. 22 | 23 | If the list has more than 1 IP returned, it should be randomized, unless the existing `NoRandomize` option is enabled. 24 | We could introduce a new option specific to this IP list as opposed to the server URLs provided by the user. 25 | 26 | Then the connection should happen in the order of the shuffled list and stop as soon as one is successful. 27 | 28 | ## Decision 29 | 30 | This was driven by the fact that the Go client behaves as described above and some users have shown interest in all clients behaving this way. 31 | Some users have DNS where the order almost never changes, which, with client libraries not performing randomization, would cause all clients 32 | to connect to the same server. 
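The resolve-then-shuffle behavior described above can be sketched as follows. This is an illustrative outline only, not the actual nats.go implementation; the `shuffleIPs` function name is hypothetical:

```go
package main

import (
	"fmt"
	"math/rand"
)

// shuffleIPs randomizes the order of a resolved IP list unless noRandomize is
// set, mirroring the NoRandomize option described in this ADR. The input slice
// is copied so the caller's list is left untouched.
func shuffleIPs(ips []string, noRandomize bool) []string {
	out := append([]string(nil), ips...)
	if !noRandomize && len(out) > 1 {
		rand.Shuffle(len(out), func(i, j int) { out[i], out[j] = out[j], out[i] })
	}
	return out
}

func main() {
	ips := []string{"192.0.2.1", "192.0.2.2", "192.0.2.3"}
	fmt.Println(shuffleIPs(ips, false)) // random order; dial each in turn, stop on first success
	fmt.Println(shuffleIPs(ips, true))  // original DNS order preserved
}
```

A client would then attempt a TCP connection to each address in the returned order, stopping at the first success.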
33 | 34 | ## Consequences 35 | 36 | This should be considered as a CHANGE for client libraries, since we are changing the default behavior. 37 | 38 | If it is strongly felt that this new default behavior should have an opt-out, other than the use of the existing `NoRandomize` option, a new option can be introduced to disable this new default behavior. 39 | -------------------------------------------------------------------------------- /go.sum: -------------------------------------------------------------------------------- 1 | github.com/russross/blackfriday/v2 v2.1.0 h1:JIOH55/0cWyOuilr9/qlrm0BSXldqnqwMsf35Ld67mk= 2 | github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM= 3 | gitlab.com/golang-commonmark/html v0.0.0-20191124015941-a22733972181 h1:K+bMSIx9A7mLES1rtG+qKduLIXq40DAzYHtb0XuCukA= 4 | gitlab.com/golang-commonmark/html v0.0.0-20191124015941-a22733972181/go.mod h1:dzYhVIwWCtzPAa4QP98wfB9+mzt33MSmM8wsKiMi2ow= 5 | gitlab.com/golang-commonmark/linkify v0.0.0-20191026162114-a0c2df6c8f82/go.mod h1:Gn+LZmCrhPECMD3SOKlE+BOHwhOYD9j7WT9NUtkCrC8= 6 | gitlab.com/golang-commonmark/linkify v0.0.0-20200225224916-64bca66f6ad3 h1:1Coh5BsUBlXoEJmIEaNzVAWrtg9k7/eJzailMQr1grw= 7 | gitlab.com/golang-commonmark/linkify v0.0.0-20200225224916-64bca66f6ad3/go.mod h1:Gn+LZmCrhPECMD3SOKlE+BOHwhOYD9j7WT9NUtkCrC8= 8 | gitlab.com/golang-commonmark/markdown v0.0.0-20211110145824-bf3e522c626a h1:O85GKETcmnCNAfv4Aym9tepU8OE0NmcZNqPlXcsBKBs= 9 | gitlab.com/golang-commonmark/markdown v0.0.0-20211110145824-bf3e522c626a/go.mod h1:LaSIs30YPGs1H5jwGgPhLzc8vkNc/k0rDX/fEZqiU/M= 10 | gitlab.com/golang-commonmark/mdurl v0.0.0-20191124015652-932350d1cb84 h1:qqjvoVXdWIcZCLPMlzgA7P9FZWdPGPvP/l3ef8GzV6o= 11 | gitlab.com/golang-commonmark/mdurl v0.0.0-20191124015652-932350d1cb84/go.mod h1:IJZ+fdMvbW2qW6htJx7sLJ04FEs4Ldl/MDsJtMKywfw= 12 | gitlab.com/golang-commonmark/puny v0.0.0-20191124015043-9f83538fa04f h1:Wku8eEdeJqIOFHtrfkYUByc4bCaTeA6fL0UJgfEiFMI= 13 | 
gitlab.com/golang-commonmark/puny v0.0.0-20191124015043-9f83538fa04f/go.mod h1:Tiuhl+njh/JIg0uS/sOJVYi0x2HEa5rc1OAaVsb5tAs= 14 | gitlab.com/opennota/wd v0.0.0-20180912061657-c5d65f63c638 h1:uPZaMiz6Sz0PZs3IZJWpU5qHKGNy///1pacZC9txiUI= 15 | gitlab.com/opennota/wd v0.0.0-20180912061657-c5d65f63c638/go.mod h1:EGRJaqe2eO9XGmFtQCvV3Lm9NLico3UhFwUpCG/+mVU= 16 | golang.org/x/text v0.3.2/go.mod h1:bEr9sfX3Q8Zfm5fL9x+3itogRgK3+ptLWKqgva+5dAk= 17 | golang.org/x/text v0.24.0 h1:dd5Bzh4yt5KYA8f9CJHCP4FB4D51c2c6JvN37xJJkJ0= 18 | golang.org/x/text v0.24.0/go.mod h1:L8rBsPeo2pSS+xqN0d5u2ikmjtmoJbDBT1b7nHvFCdU= 19 | golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ= 20 | -------------------------------------------------------------------------------- /adr/ADR-17.md: -------------------------------------------------------------------------------- 1 | # Ordered Consumer 2 | 3 | |Metadata|Value| 4 | |--------|-----| 5 | |Date |2021-09-29| 6 | |Author |@scottf| 7 | |Status |Implemented| 8 | |Tags |jetstream,client| 9 | 10 | ### Context and Problem Statement 11 | 12 | Provide an ordered push subscription for the user that automatically checks and recovers when a gap occurs in the consumer sequence. 13 | The subscription can deliver messages synchronously or asynchronously as normally supported by the client. 14 | 15 | ### Behavior 16 | 17 | The subscription should leverage Gap Management and Auto Status Management to ensure messages are received in the proper order. 18 | 19 | The subscription must track the last good stream and consumer sequences. 20 | When a gap is observed, the subscription closes its current subscription, 21 | releases its consumer and creates a new one starting at the proper stream sequence. 22 | 23 | If heartbeats are missed, the consumer might be gone (deleted, lost after a reconnect, node restart, etc.), and it should be recreated from the last known stream sequence. 
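The gap check described above can be sketched as follows. This is a minimal illustration, assuming an ordered consumer expects each delivery's consumer sequence to increase by exactly one; the `orderedState` and `checkSequence` names are hypothetical, not taken from any specific client:

```go
package main

import "fmt"

// orderedState tracks the last good stream and consumer sequences, as the
// ADR requires.
type orderedState struct {
	lastStreamSeq   uint64
	lastConsumerSeq uint64
}

// checkSequence returns false when a gap is detected; the caller should then
// recreate the consumer starting at lastStreamSeq + 1.
func (s *orderedState) checkSequence(streamSeq, consumerSeq uint64) bool {
	if consumerSeq != s.lastConsumerSeq+1 {
		return false // gap: a consumer sequence was skipped
	}
	s.lastStreamSeq = streamSeq
	s.lastConsumerSeq = consumerSeq
	return true
}

func main() {
	s := &orderedState{}
	fmt.Println(s.checkSequence(10, 1)) // true: in order
	fmt.Println(s.checkSequence(11, 2)) // true: in order
	fmt.Println(s.checkSequence(14, 4)) // false: consumer sequence 3 was missed
}
```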
24 | 25 | You can optionally make the "state" available to the user. 26 | 27 | ### Subscription Limitations 28 | 29 | The subscription cannot be 30 | - a pull consumer 31 | - a durable consumer 32 | - bound or "direct" 33 | 34 | The subscription is not allowed with queues/deliver groups. 35 | 36 | ### Consumer Configuration Checks 37 | 38 | The user can provide a consumer configuration but it must be validated. An error is raised at creation time if validation fails. 39 | 40 | Checks: 41 | 42 | - durable_name: must not be provided 43 | - deliver_subject: must not be provided 44 | - ack policy: must not be provided or set to none. Set it to none if it is not provided. 45 | - max_deliver: must not be provided or set to 1. Set it to 1 if it is not provided. 46 | - flow_control: must not be provided or set true. Set it to true if it is not provided. 47 | - mem_storage: must not be provided or set to true. Set to true if it is not provided. 48 | - num_replicas: must not be provided. Set to 1. 49 | 50 | Check and set these settings without an error: 51 | 52 | - idle_heartbeat if not provided, set to 5 seconds 53 | - ack_wait set to something large like 22 hours (matches the Go implementation) 54 | -------------------------------------------------------------------------------- /adr/ADR-10.md: -------------------------------------------------------------------------------- 1 | # JetStream Extended Purge 2 | 3 | |Metadata|Value| 4 | |--------|-----| 5 | |Date |2021-06-30| 6 | |Author |@aricart| 7 | |Status |Implemented| 8 | |Tags |server, client, jetstream| 9 | 10 | ## Context 11 | 12 | JetStream provides the ability to purge streams by sending a request message to: `$JS.API.STREAM.PURGE.`. 
The request will return a new message with 14 | the following JSON: 15 | 16 | ```typescript 17 | { 18 | type: "io.nats.jetstream.api.v1.stream_purge_response", 19 | error?: ApiError, 20 | success: boolean, 21 | purged: number 22 | } 23 | ``` 24 | 25 | The `error` field is an [ApiError](ADR-7.md). The `success` field will be set to `true` if the request 26 | succeeded. The `purged` field will be set to the number of messages that were 27 | purged from the stream. 28 | 29 | ## Options 30 | 31 | More fine-grained control over the purge request can be achieved by specifying 32 | additional options as JSON payload. 33 | 34 | ```typescript 35 | { 36 | seq?: number, 37 | keep?: number, 38 | filter?: string 39 | } 40 | ``` 41 | 42 | - `seq` is the optional upper-bound sequence for messages to be deleted 43 | (non-inclusive) 44 | - `keep` is the maximum number of messages to be retained (might be less 45 | depending on whether the specified count is available). 46 | - The options `seq` and `keep` are mutually exclusive. 47 | - `filter` is an optional subject (may include wildcards) to filter on. Only 48 | messages matching the filter will be purged. 49 | - `filter` and `seq` purges all messages matching filter having a sequence 50 | number lower than the value specified. 51 | - `filter` and `keep` purges all messages matching filter keeping at most the 52 | specified number of messages. 53 | - If `seq` or `keep` is specified, but `filter` is not, the stream will 54 | remove/keep the specified number of messages. 55 | - To `keep` _N_ number of messages for multiple subjects, invoke `purge` with 56 | different `filter`s. 57 | - If no options are provided, all messages are purged. 58 | 59 | ## Consequences 60 | 61 | Tooling and services can use this endpoint to remove messages in creative ways. 62 | For example, a stream may contain a number of samples, at periodic intervals a 63 | service can sum them all and replace them with a single aggregate. 
64 | -------------------------------------------------------------------------------- /adr/ADR-9.md: -------------------------------------------------------------------------------- 1 | # JetStream Consumer Idle Heartbeats 2 | 3 | |Metadata|Value| 4 | |--------|-----| 5 | |Date |2021-05-12| 6 | |Author |@aricart| 7 | |Status |Approved| 8 | |Tags |server, client, jetstream| 9 | 10 | ## Context 11 | 12 | The JetStream ConsumerConfig option `idle_heartbeat` enables server-side 13 | heartbeats to be sent to the client. To enable the option on the consumer simply 14 | specify it with a value representing the number of nanoseconds that the server 15 | should use as notification interval. 16 | 17 | The server will only notify after the specified interval has elapsed and no new 18 | messages have been delivered to the consumer. Delivering a message to the 19 | consumer resets the interval. 20 | 21 | The idle heartbeats notifications are sent to the consumer's subscription as a 22 | regular NATS message. The message will have a `code` of `100` with a 23 | `description` of `Idle Heartbeat`. The message will contain additional headers 24 | that the client can use to re-affirm that it has not lost any messages: 25 | 26 | - `Nats-Last-Consumer` indicates the last consumer sequence delivered to the 27 | client. If `0`, no messages have been delivered. 28 | - `Nats-Last-Stream` indicates the sequence number of the newest message in the 29 | stream. 30 | 31 | Here's an example of a client creating a consumer with an idle_heartbeat of 10 32 | seconds, followed by a server notification. 33 | 34 | ``` 35 | $JS.API.CONSUMER.CREATE.FRHZZ447RL7NR8TAICHCZ6 _INBOX.FRHZZ447RL7NR8TAICHCQ8.FRHZZ447RL7NR8TAICHDQ0 136␍␊ 36 | {"config":{"ack_policy":"explicit","deliver_subject":"my.messages","idle_heartbeat":10000000000}, 37 | "stream_name":"FRHZZ447RL7NR8TAICHCZ6"}␍␊ 38 | ... 
39 | 40 | > HMSG my.messages 2 75 75␍␊NATS/1.0 100 Idle Heartbeat␍␊Nats-Last-Consumer: 0␍␊Nats-Last-Stream: 0␍␊␍␊␍␊ 41 | alive - last stream seq: 0 - last consumer seq: 0 42 | ``` 43 | 44 | This feature is intended as an aid to clients to detect when they have been 45 | disconnected. Without it the consumer's subscription may sit idly waiting for 46 | messages, without knowing that the server might have simply gone away and 47 | recovered elsewhere. 48 | 49 | ## Consequences 50 | 51 | Clients can use this information to set client-side timers that track how many 52 | heartbeats have been missed and perhaps take some action such as re-create a 53 | subscription to resume messages. 54 | -------------------------------------------------------------------------------- /adr/ADR-34.md: -------------------------------------------------------------------------------- 1 | # JetStream Consumers Multiple Filters 2 | 3 | | Metadata | Value | 4 | |----------|---------------------------| 5 | | Date | 2023-01-18 | 6 | | Author | @Jarema | 7 | | Status | Approved | 8 | | Tags | jetstream, client, server | 9 | 10 | ## Context and Problem Statement 11 | 12 | Initially, JetStream Consumers could have only one Filter Subject. 13 | As the number of feature requests to specify multiple subjects increased, this feature was added. 14 | That could also reduce the number of Consumers in general. 15 | 16 | ## Context 17 | 18 | Server PR: https://github.com/nats-io/nats-server/pull/3500 19 | 20 | ## Design 21 | 22 | ### Client side considerations 23 | 24 | To implement the feature without any breaking changes, a new field should be added to Consumer config, both on the server and the clients. 25 | 26 | 1. The new field is: 27 | `FilterSubjects []string json:"filter_subjects"` 28 | 29 | 2. Subjects can't overlap each other and have to fit the interest of the Stream. 30 | In case of overlapping subjects, error (10136) will be returned. 31 | 32 | 3. 
Only one of `FilterSubject` or `FilterSubjects` can be passed. Passing both results in an error (10134). 33 | 34 | 4. Until future improvements, only the old JS API for consumer creation can be used (the one without `FilterSubject`) in the consumer create request. Using the new API will yield an error (10135). 35 | 36 | 5. Each client, to support this feature, needs to add a new field that marshals into a JSON array of strings. 37 | 38 | 6. To ensure compatibility with old servers that are not aware of the `filter_subjects` field, clients should check the returned info (from update or create) to verify that the filters are set up properly. 39 | 40 | **Example** 41 | ```json 42 | { 43 | "durable_name": "consumer", 44 | "filter_subjects": ["events", "data"] 45 | } 46 | ``` 47 | 7. The client does not have to check whether both single and multiple filters were passed, as the server will validate it. 48 | The client should add the new errors to return them in a language-idiomatic fashion. 49 | 50 | ### Server side 51 | 52 | To make this change possible and reasonably performant, the server will keep a buffer of the first message for each filtered subject and will deliver them in order. After delivering a message for a subject, the buffer for that subject will be refilled, resulting in close to no overhead after the initial buffer fill. 53 | 54 | This can be optimized but will not affect the API. 
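The mutual-exclusion rule from point 3 can be sketched as an optional client-side pre-check. This is a hedged illustration only; the ADR states the server already validates this (error 10134), and the `validateFilters` helper is a hypothetical name:

```go
package main

import (
	"errors"
	"fmt"
)

// ConsumerConfig shows only the two filtering fields relevant to this ADR.
type ConsumerConfig struct {
	FilterSubject  string   `json:"filter_subject,omitempty"`
	FilterSubjects []string `json:"filter_subjects,omitempty"`
}

// validateFilters rejects configurations that set both the single and the
// multiple filter fields, mirroring the server-side rule described above.
func validateFilters(cfg ConsumerConfig) error {
	if cfg.FilterSubject != "" && len(cfg.FilterSubjects) > 0 {
		return errors.New("only one of filter_subject or filter_subjects may be set")
	}
	return nil
}

func main() {
	fmt.Println(validateFilters(ConsumerConfig{FilterSubjects: []string{"events", "data"}}))
	fmt.Println(validateFilters(ConsumerConfig{FilterSubject: "events", FilterSubjects: []string{"data"}}))
}
```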
55 | 56 | -------------------------------------------------------------------------------- /adr/ADR-18.md: -------------------------------------------------------------------------------- 1 | # URL support for all client options 2 | 3 | | Metadata | Value | 4 | |----------|-------------| 5 | | Date | 2021-07-21 | 6 | | Author | philpennock | 7 | | Status | Deprecated | 8 | | Tags | deprecated | 9 | 10 | ## Deprecation Note 11 | 12 | We discussed this among client authors and felt that we would not implement this for, among others, these reasons: 13 | 14 | * Server lists hold many server URLs but the client has just one set of options; a UX allowing options on each seems suboptimal 15 | * These parameters could expose sensitive information in logs and more 16 | * Getting exact parity between all the clients would be hard; we would need to pick a specific list of supported options, but it would be forever churn to support more on all clients 17 | * It would be difficult to always do the right thing with respect to priority; in some cases you would want the URL to set defaults, in others you would want the URL to override settings 18 | 19 | Overall we recognise this would be useful but decided against it for now. 20 | 21 | ## Motivation 22 | 23 | TBD 24 | 25 | ## Overview 26 | 27 | NATS URLs should be able to encode all information required to connect to a NATS server in a useful manner, except perhaps the contents of the CA certificate. This URL encoding should be consistent across all client languages, and be fully documented. 28 | 29 | Making explicit comma-separated lists of URLs, vs of hostnames within a URL, and ensuring that is compatible across all clients is included, plus order randomization. 30 | 31 | Anything tuning connection behavior, which might be used as an option on establishing a connection, should be specifiable in a URL. 
Anything which doesn't fit into authority information in the URL should probably be "query parameters", `?opt1=foo&opt2=bar` with the documentation establishing the option names and the behavior for unrecognized values for each option. Unrecognized options should be ignored. It's possible that some options should be `#fragopt1=foo&opt2=bar` instead and we should clearly define where we draw the line. Eg, "if it's not sent to the server but is reconnect timing information, it should be `#fragment`" or "for consistency we use `?query` for them all". 32 | 33 | Things which should be configurable include, but are not limited to: 34 | 35 | * OCSP checking status 36 | * JetStream Domain 37 | * Various timeouts 38 | * TLS verification level (if we support anything other than verify always) 39 | * Server TLS cert pinning per `hex(sha256(spki))` 40 | -------------------------------------------------------------------------------- /adr/ADR-12.md: -------------------------------------------------------------------------------- 1 | # JetStream Encryption At Rest 2 | 3 | |Metadata|Value| 4 | |--------|-----| 5 | |Date |2021-07-20| 6 | |Author |@derekcollison| 7 | |Status |Implemented| 8 | |Tags |jetstream| 9 | 10 | ## Context and Problem Statement 11 | 12 | Present a way to have the nats-server itself provide encryption at rest for all stored JetStream data. This should be 13 | seen as an alternative to an encrypted filesystem or block storage that would be provided by a cloud provider. 14 | 15 | End user documentation for this feature can be found in [the official documentation](https://docs.nats.io/jetstream/encryption_at_rest). 16 | 17 | ## Design 18 | 19 | The design will allow a master key to be used to generate keys for asset encryption within JetStream. The "assets" 20 | are *account* information, and metadata and data for *streams* and *consumers*. 
Even within stream data, which can be broken 21 | into multiple message blocks on disk, each block will have a different asset key used for encryption. 22 | 23 | ## Keys 24 | 25 | * *EK* - External Key, provided to the nats-server configuration or via an encryption service in subsequent releases. 26 | * *KEK* - An encryption key derived from the *EK*. Always generated and never stored. This is generated via a PRF where 27 | `KEK := PRF(EK, IV, Context)`, e.g. `HMAC-SHA-256`, `Argon2Id` 28 | * *AEK* - Asset encryption key, encrypted via a *KEK* and placed alongside the asset on disk. 29 | 30 | ## Encryption 31 | 32 | Encryption will utilize an AEAD (authenticated encryption with associated data) construction. We use ChaCha20-Poly1305. 33 | 34 | ## Future Support 35 | 36 | ### KMS 37 | 38 | In a follow-on release we may choose to configure the nats-servers to access a KMS and not have direct access to the *EK*. 39 | Today the recommended method is to use an environment variable as demonstrated [in the documentation](https://docs.nats.io/jetstream/encryption_at_rest). 40 | 41 | ### Key Rolling 42 | 43 | In the initial release we will have different keys for each message block for a stream, meaning the *KEK* and *AEK* will 44 | not be re-used for any subsequent message blocks. 45 | 46 | However, we may want to introduce the ability to roll keys, which is TBD. For the first release a backup and restore 47 | will be equivalent to changing all of the encryption keys used to encrypt the stream asset. 48 | 49 | We would want to think through rotating the *EK* as well at some point. We would need access to the old one for 50 | retrieving older assets. 51 | 52 | ## Client Side EK 53 | 54 | We could also consider allowing clients to provide this *EK* in a header such that even if the server had a KMS the 55 | client could control this on their own. Of course this requires TLS (which we normally promote) and is not as good as a 56 | KMS or a cloud-provider solution like encrypted EBS etc. 
57 | 58 | -------------------------------------------------------------------------------- /adr/ADR-56.md: -------------------------------------------------------------------------------- 1 | # JetStream Consistency Models 2 | 3 | | Metadata | Value | 4 | |----------|-----------------------------| 5 | | Date | 2025-09-12 | 6 | | Author | @ripienaar, @MauriceVanVeen | 7 | | Status | Approved | 8 | | Tags | server, 2.12 | 9 | 10 | | Revision | Date | Author | Info | 11 | |----------|------------|-----------------------------|---------------------------------------------------| 12 | | 1 | 2025-09-12 | @ripienaar, @MauriceVanVeen | Initial document for R1 `async` persistence model | 13 | 14 | ## Context and Problem Statement 15 | 16 | JetStream is a distributed message persistence system and delivers certain promises when handling user data. 17 | 18 | This document intends to describe the models it supports, the promises it makes and how to configure the different models. 19 | 20 | > [!NOTE] 21 | > This document is a living document; at present we will only cover the `async` persistence model with an aim to expand in time 22 | > 23 | 24 | ## R1 `async` Persistence Mode 25 | 26 | The `async` persistence mode of a stream will result in asynchronous flushing of data to disk. This results in a significant speed-up, as each message is not individually written to disk, but comes at the expense of data loss during severe disruptions in power, server or disk subsystems. 27 | 28 | If the server is running with `sync: always` set, that setting will be overridden by this setting for the specific stream; the stream would no longer be in `sync: always` mode despite the system-wide setting. 29 | 30 | At the moment this mode cannot support batch publishing at all and any attempt to start a batch against a stream in this mode must fail. 31 | 32 | This setting will require API Level 2.
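As a sketch, a stream configuration enabling this mode might look as follows. The JSON field name `persist_mode` is an assumption for illustration only; this document defines the `PersistMode` key but not its wire-level name.

```json
{
  "name": "EVENTS",
  "retention": "limits",
  "num_replicas": 1,
  "persist_mode": "async"
}
```

Note that per the rules below this is only valid because `num_replicas` is 1.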
33 | 34 | The interactions between `PersistMode:async` and `sync:always` are as follows: 35 | 36 | * `PersistMode:default`, `sync:always` - all writes are flushed (default) and synced 37 | * `PersistMode:default`, not `sync:always` - all writes are flushed (default), but synced only per sync interval 38 | * `PersistMode:async` - PubAck is essentially returned first, writes are batched in-memory, and the write happens asynchronously in the background 39 | 40 | ### Implications 41 | 42 | * The Publish Ack will be sent before the data is known to be written to disk 43 | * An increased chance of data loss during any disruption to the server 44 | 45 | ### Configuration 46 | 47 | * The `PersistMode` key should be unset or `default` for the default strongest possible consistency level 48 | * Setting it on anything other than an R1 stream will result in an error 49 | * Scaling an R1 stream up to greater resiliency levels will fail if the `PersistMode` is set to `async` 50 | * When the user provides no value for `PersistMode` the implied default is `default` but the server will not set this in the configuration; the result of INFO requests will also have it unset 51 | * Setting `PersistMode` to anything other than empty/absent will require API Level 2 52 | 53 | -------------------------------------------------------------------------------- /adr/ADR-48.md: -------------------------------------------------------------------------------- 1 | # TTL Support for Key-Value Buckets 2 | 3 | | Metadata | Value | 4 | |----------|-----------------------------------------| 5 | | Date | 2025-04-09 | 6 | | Author | @ripienaar | 7 | | Status | Implemented | 8 | | Tags | jetstream, client, kv, refinement, 2.11 | 9 | | Updates | ADR-8 | 10 | 11 | 12 | | Revision | Date | Author | Info | 13 | |----------|------------|-----------|----------------------------------| 14 | | 1 | 2025-06-30 | @scottf | Clarify purge and error handling | 15 | 16 | ## Context 17 | 18 | Since NATS Server 2.11 we
support [Per-Message TTLs](ADR-43.md); we wish to expose some KV-specific features built 19 | on this capability. 20 | 21 | * Improve Watchers by notifying of Max Age deleted messages 22 | * Improve Purge so that old subjects can be permanently removed, removing the need for costly compacts, while still supporting Watchers 23 | * Creating keys with a custom lifetime 24 | 25 | In Key Value, we call these Limit Markers. 26 | 27 | ## Configuration 28 | 29 | Configuration would get a single extra property in a language idiomatic version of `Limit Markers` that will set `allow_msg_ttl` to `true` and `subject_delete_marker_ttl` to the supplied duration. 30 | 31 | This duration value must be larger than or equal to 1 second. 32 | 33 | This should only be set on a server with API level 1 or newer. At the moment the only way this is exposed is via the `$JS.API.INFO` API call; clients should check this when this feature is requested. 34 | 35 | The configuration item can be enabled for buckets that have it disabled but should not support disabling it, as today the Server would not handle old TTLs correctly should it again be enabled later. 36 | 37 | ## Status 38 | 39 | The `Status` interface would get a new property that reports on the configured setting: 40 | 41 | ```go 42 | type Status interface { 43 | // LimitMarkerTTL is how long the bucket keeps markers when keys are removed by the TTL setting, 0 meaning markers are not supported 44 | LimitMarkerTTL() time.Duration 45 | 46 | //.... 47 | } 48 | ``` 49 | 50 | ## API Changes 51 | 52 | The functions noted here should support accepting a TTL for the specified API and pass errors on to the user when the server errors because the bucket does not support the feature. 53 | 54 | The basic operation is to add a `Nats-TTL` header to the API request. See [ADR-43](ADR-43.md) for more information. 55 | 56 | ### Storing Values 57 | 58 | The `Create()` function should support accepting a TTL.
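At the protocol level, a Create with a TTL amounts to publishing to the bucket's underlying subject with a `Nats-TTL` header alongside the usual create semantics. The sketch below is not any client's actual API; the bucket and key names are made up, and `Nats-Expected-Last-Subject-Sequence: 0` is the existing KV create guard.

```go
package main

import "fmt"

// buildCreateRequest sketches the message a client could publish for a
// KV Create() carrying a TTL: the KV subject plus a Nats-TTL header.
func buildCreateRequest(bucket, key, ttl string) (string, map[string][]string) {
	subject := fmt.Sprintf("$KV.%s.%s", bucket, key)
	hdr := map[string][]string{
		// Existing KV create guard: only succeed if the key has no revision yet.
		"Nats-Expected-Last-Subject-Sequence": {"0"},
		// Per-message TTL from ADR-43; must be at least the bucket minimum.
		"Nats-TTL": {ttl},
	}
	return subject, hdr
}

func main() {
	subj, hdr := buildCreateRequest("CONFIG", "feature.flag", "1h")
	fmt.Println(subj, hdr["Nats-TTL"][0])
}
```

If the bucket does not support the feature, the server's error from this request is passed on to the user as described above.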
59 | 60 | Clients can implement this as a varargs version of `Create()`, a configuration option for `Create()` or another idiomatic manner the language supports. 61 | 62 | ### Purging Keys 63 | 64 | The `Purge()` function should support accepting a TTL. 65 | 66 | Clients can implement this as a varargs version of `Purge()`, a configuration option for `Purge()` or another idiomatic manner the language supports. 67 | 68 | ### Do Not Support 69 | 70 | At this time, do not accept a TTL for other APIs. Some are currently undefined, and some are understood to create improper state. For instance, a TTL on `Put()` might mean older revisions could come back from the dead once the TTL expires. 71 | 72 | ### Retrieving Values 73 | 74 | When the bucket supports Limit Marker TTLs the clients will receive messages with a header `Nats-Marker-Reason` with these possible values and behaviors: 75 | 76 | | Value | Behavior | 77 | |----------|------------------| 78 | | `MaxAge` | Treat as `PURGE` | 79 | | `Purge` | Treat as `PURGE` | 80 | | `Remove` | Treat as `DEL` | 81 | 82 | Watchers should be updated to handle these values also. 83 | -------------------------------------------------------------------------------- /adr/ADR-55.md: -------------------------------------------------------------------------------- 1 | # Trusted Protocol Aware Proxies 2 | 3 | | Metadata | Value | 4 | |----------|--------------| 5 | | Date | 2025-08-04 | 6 | | Author | @ripienaar | 7 | | Status | Approved | 8 | | Tags | server, 2.12 | 9 | 10 | 11 | ## Context and Problem Statement 12 | 13 | A NATS protocol aware Proxy makes connections to NATS Servers on behalf of Clients and Leafnodes. The proxy would be aware of information the NATS Server could not know in such an arrangement and must securely pass that information to the Server.
14 | 15 | * Information such as Source IP Address 16 | * Information related to TLS connection properties 17 | * Users might be required by their JWT and other user records to connect via Proxies and not directly 18 | 19 | In all these cases it would be required to have a list of trusted Proxy servers to communicate these states. 20 | 21 | ## Server Configuration 22 | 23 | We introduce the concept of trusted Proxies into the server, which combines with the passing of information, or the requirement of a proxied network path, to form a trust relationship. 24 | 25 | A server will be configured as follows: 26 | 27 | ``` 28 | proxies { 29 | trusted = [ 30 | {key: xxxxxx} 31 | ] 32 | } 33 | ``` 34 | 35 | Here we list a number of Proxies we trust using their public nkey as the identifier. 36 | 37 | We will support configuration reload of the `proxies` block, and, should a trusted proxy be removed, we will disconnect all connections that came from that proxy. 38 | 39 | When configured, this information should be exposed in `VARZ` output. 40 | 41 | ## Proxies Required 42 | 43 | We should be able to require that users connect via a proxy. To support this case a `proxies` block must be configured in addition to per-user properties; here are some examples: 44 | 45 | ``` 46 | authorization { 47 | users = [ 48 | {user: deliveries, password: $PASS, proxy_required: true} 49 | ] 50 | } 51 | ``` 52 | 53 | Leafnodes could require the same: 54 | 55 | ``` 56 | leafnodes { 57 | port: ... 58 | authorization { 59 | ... 60 | proxy_required: true 61 | } 62 | } 63 | ``` 64 | 65 | Likewise JWTs will gain a boolean field `ProxyRequired` which will indicate the same requirement.
66 | 67 | When constructing the `INFO` line the NATS Server will: 68 | 69 | * If any `proxies` are configured, always include a `nonce` in the `INFO` line 70 | * Always report the server `JSApiLevel` (`api_lvl`) on the `INFO` line 71 | 72 | The NATS Protocol Aware Proxy will then intercept the `INFO` and `CONNECT` protocol lines and inject a `proxy_sig` key into the `CONNECT` line that holds a signature of the same `nonce` in addition to all the client provided `CONNECT` fields. 73 | 74 | The proxy can detect that the server it is connected to is proxy-aware by checking that the `api_lvl` is at least `2`. 75 | 76 | While authorizing the connection the server will: 77 | 78 | * If a `proxy_sig` is present, verify it is from a known trusted proxy, rejecting the connection if it is present but invalid 79 | * If Auth Callout is configured, call the script to obtain the user JWT which might set the `ProxyRequired` field 80 | * Check if the user requires a proxy and reject connections that do not have an associated trusted proxy, or where no proxies are configured 81 | * The proxy handling a connection is stored with the connection and reported in `CONNZ`, `LEAFZ` and related monitoring endpoints 82 | 83 | The above arrangement means Auth Callout does not need to know which proxies are trusted, etc.; it simply has to set the boolean in the resulting JWT. 84 | 85 | When rejecting a connection the server will log a message indicating the reason; the `CONNZ` disconnect reason will also reflect that it was due to not accessing via a proxy, and the `io.nats.server.advisory.v1.client_disconnect` event will reflect the same reason. The connecting client will get the same nondescript error message as today.
-------------------------------------------------------------------------------- /adr/ADR-6.md: -------------------------------------------------------------------------------- 1 | # Naming Rules 2 | 3 | |Metadata|Value| 4 | |--------|-----| 5 | |Date |2021-08-17| 6 | |Author |@scottf| 7 | |Status |Approved| 8 | |Tags |server, client| 9 | 10 | ## Context 11 | 12 | This document describes naming conventions for these protocol components: 13 | 14 | * Subjects (including Reply Subjects) 15 | * Stream Names 16 | * Consumer Names 17 | * Account Names 18 | 19 | ## Prior Work 20 | 21 | Currently, the NATS Docs regarding [protocol convention](https://docs.nats.io/nats-protocol/nats-protocol#protocol-conventions) says this: 22 | 23 | > Subject names, including reply subject (INBOX) names, are case-sensitive and must be non-empty alphanumeric strings with no embedded whitespace. All ascii alphanumeric characters except spaces/tabs and separators which are "." and ">" are allowed. Subject names can be optionally token-delimited using the dot character (.), e.g.: 24 | A subject is comprised of 1 or more tokens. Tokens are separated by "." and can be any non space ascii alphanumeric character. The full wildcard token ">" is only valid as the last token and matches all tokens past that point. A token wildcard, "*" matches any token in the position it was listed. Wildcard tokens should only be used in a wildcard capacity and not part of a literal token. 25 | 26 | > Character Encoding: Subject names should be ascii characters for maximum interoperability. Due to language constraints and performance, some clients may support UTF-8 subject names, as may the server. No guarantees of non-ASCII support are provided. 27 | 28 | ## Specification 29 | 30 | ``` 31 | dot = "." 32 | asterisk = "*" 33 | lt = "<" 34 | gt = ">" 35 | dollar = "$" 36 | colon = ":" 37 | double-quote = ["] 38 | fwd-slash = "/" 39 | backslash = "\" 40 | pipe = "|" 41 | question-mark = "?" 
42 | ampersand = "&" 43 | dash = "-" 44 | underscore = "_" 45 | equals = "=" 46 | printable = all printable ascii (33 to 126 inclusive) 47 | term = (printable except dot, asterisk or gt)+ 48 | limited-term = (A-Z, a-z, 0-9, dash, underscore, fwd-slash, equals)+ 49 | limited-term-w-sp = (A-Z, a-z, 0-9, dash, underscore, fwd-slash, equals, space)+ 50 | restricted-term = (A-Z, a-z, 0-9, dash, underscore)+ 51 | prefix = (printable except dot, asterisk, gt or dollar)+ 52 | filename-safe = (printable except dot, asterisk, gt, fwd-slash, backslash)+ maximum 255 characters 53 | 54 | message-subject = term (dot term | asterisk)* (dot gt)? 55 | reply-to = term (dot term)* 56 | stream-name = filename-safe 57 | durable-name = filename-safe 58 | consumer-name = filename-safe 59 | account-name = filename-safe 60 | queue-name = term 61 | js-internal-prefix = dollar (prefix dot)+ 62 | js-user-prefix = (prefix dot)+ 63 | kv-key-name = limited-term (dot limited-term)* 64 | kv-bucket-name = restricted-term 65 | os-bucket-name = restricted-term 66 | os-object-name = (any-character)+ 67 | ``` 68 | 69 | ## Notes 70 | 71 | ### filename-safe 72 | 73 | The `filename-safe` is designed with unix operating systems in mind. 74 | If the server is running on Windows, the `filename-safe` is too lenient and will operate like: 75 | 76 | ``` 77 | filename-safe = (printable except dot, asterisk, lt, gt, colon, double-quote, fwd-slash, backslash, pipe, question-mark)+ maximum 255 characters 78 | ``` 79 | 80 | The server will reject names (return an API error) when it cannot build a valid path. 81 | 82 | ### kv-key-name 83 | 84 | Keys starting with `_kv` are limited to internal use. 85 | 86 | ### Client Validation 87 | 88 | This note is simply to capture that currently the Go client simply validates that stream and consumer/durable names are not empty, 89 | and do not contain `.`. Any other constraints are left to the server. 
The NATS CLI performs additional validations: `.`, `*`, `>`, `/` and `\` 90 | are not allowed. 91 | -------------------------------------------------------------------------------- /adr/ADR-22.md: -------------------------------------------------------------------------------- 1 | # JetStream Publish Retries on No Responders 2 | 3 | | Metadata | Value | 4 | |----------|---------------------------| 5 | | Date | 2022-03-18 | 6 | | Author | @wallyqs | 7 | | Status | Partially Implemented | 8 | | Tags | jetstream, client | 9 | 10 | ## Motivation 11 | 12 | When the NATS Server is running with JetStream in cluster mode, there 13 | can be occasional blips in leadership which can result in a number 14 | of `no responders available` errors during the election. In order to 15 | try to mitigate these failures, retries can be added into JetStream 16 | enabled clients to attempt to publish the message to JetStream once it 17 | is ready again. 18 | 19 | ## Implementation 20 | 21 | A `no responders available` error uses the 503 status header to signal 22 | a client that there was no one available to serve the published 23 | request. A synchronous `Publish` request when using the JetStream 24 | context internally uses a `Request` to produce a message and if the 25 | JetStream service was not ready at the moment of publishing, the 26 | server will send to the requestor a 503 status message right away. 27 | 28 | To improve robustness of producing messages to JetStream, a client can 29 | back off for a bit and then try to send the message again later. 30 | By default, the Go client waits for `250ms` and will retry 2 times 31 | sending the message (so that in total it would have attempted to send 32 | the message 3 times).
33 | 34 | Below can be found an example implementation using the `Request` API 35 | from the Go client: 36 | 37 | ```go 38 | // Stream that persists messages sent to 'foo' 39 | js.AddStream(&nats.StreamConfig{Name: "foo"}) 40 | 41 | var ( 42 | retryWait = 250 * time.Millisecond 43 | maxAttempts = 2 44 | i = 0 45 | ) 46 | 47 | // Loop to publish a message every 100ms 48 | for range time.NewTicker(100 * time.Millisecond).C { 49 | subject := "foo" 50 | msg := fmt.Sprintf("i:%d", i) 51 | _, err := nc.Request(subject, []byte(msg), 1*time.Second) 52 | if err != nil && err == nats.ErrNoResponders { 53 | for attempts := 0; attempts < maxAttempts; attempts++ { 54 | // Backoff before retrying 55 | time.Sleep(retryWait) 56 | 57 | // Next attempt 58 | _, err = nc.Request(subject, []byte(msg), 1*time.Second) 59 | if err != nats.ErrNoResponders { 60 | // Success or a different error, stop retrying 61 | break 62 | } 63 | } 64 | } 65 | i++ 66 | } 67 | ``` 68 | 69 | ## Errors 70 | 71 | After exhausting the number of attempts, the result should either be a timeout error 72 | in case the deadline expired or a `nats: no response from stream` error 73 | if the error from the last attempt was still a `no responders error`. 74 | 75 | ## Examples 76 | 77 | ### Customizing retries with `RetryWait` and `RetryAttempts` 78 | 79 | Two options are added to customize the retry logic from the defaults: 80 | 81 | ```go 82 | _, err := js.Publish("foo", []byte("bar"), nats.RetryWait(250*time.Millisecond), nats.RetryAttempts(10)) 83 | if err != nil { 84 | log.Println("Pub Error", err) 85 | } 86 | ``` 87 | 88 | ### Make Publish retry as needed until deadline 89 | 90 | It is possible to set a maximum deadline for the retries so that the client can retry as needed.
91 | In the example below a client will attempt to publish for up to 10 seconds, waiting for an ack response 92 | from the server, backing off `250ms` as needed until the service is available again: 93 | 94 | ```go 95 | // Using Go context package 96 | ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) 97 | defer cancel() 98 | _, err := js.Publish("foo", []byte("bar"), nats.Context(ctx), nats.RetryWait(250*time.Millisecond), nats.RetryAttempts(-1)) 99 | if err != nil { 100 | log.Println("Pub Error", err) 101 | } 102 | 103 | 104 | // Custom AckWait 105 | _, err = js.Publish("foo", []byte("bar"), nats.AckWait(10*time.Second), nats.RetryWait(250*time.Millisecond), nats.RetryAttempts(-1)) 106 | if err != nil { 107 | log.Println("Pub Error", err) 108 | } 109 | ``` 110 | -------------------------------------------------------------------------------- /adr/ADR-35.md: -------------------------------------------------------------------------------- 1 | # JetStream Filestore Compression 2 | 3 | | Metadata | Value | 4 | |----------|---------------------------| 5 | | Date | 2023-05-01 | 6 | | Author | @neilalexander | 7 | | Status | Implemented | 8 | | Tags | jetstream, client, server | 9 | 10 | ## Context and Problem Statement 11 | 12 | Use of filestore encryption can almost completely prevent host filesystem compression or deduplication from working effectively. This may present a particular problem in environments where encryption is mandated for compliance reasons but local storage is either limited or expensive. Having the ability for the NATS Server to compress the message block content before encryption takes place can help in this area. 13 | 14 | ## References 15 | 16 | Compression and decompression of messages is performed transparently by the NATS Server if configured to do so; therefore, clients do not need to be modified in order to publish to or consume messages from a stream.
However, clients will need to be modified in order to be able to configure or inspect the compression on a stream. 17 | 18 | - Server PRs: 19 | - 20 | - 21 | - JetStream schema: 22 | - 23 | - NATS CLI: 24 | - 25 | 26 | ## Design 27 | 28 | The stream configuration will gain a new optional `"compression"` field. If supplied, the following values are valid: 29 | 30 | - `"none"` — No compression is enabled on the stream 31 | - `"s2"` — S2 compression is enabled on the stream 32 | 33 | This field can be provided when creating a stream with `$JS.API.STREAM.CREATE`, updating a stream with `$JS.API.STREAM.UPDATE` and it will be returned when requesting the stream info with `$JS.API.STREAM.INFO`. 34 | 35 | When enabled, message blocks will be compressed asynchronously when they cease to be the tail block — that is, at the point that the message block reaches the maximum configured block size and a new block is created. This is to prevent unnecessary decompression and recompression of the tail block while it is still being written to, which would reduce publish throughput. 36 | 37 | Compaction and truncation operations will also compress/decompress any relevant blocks synchronously as required. 38 | 39 | Compressed blocks gain a new prepended header describing not only the compression algorithm in use but also the original block content size. This header is encrypted along with the rest of the block when filestore encryption is enabled. Absence of this header implies that the block is not compressed and the NATS Server will not ordinarily prepend a header to an uncompressed block. The presence of the original block content size within the header makes it possible to determine the effective compression ratio later without having to decompress the block, although the NATS Server does not currently do this. 
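The prepended header could be sketched as below. This layout (one algorithm byte followed by the original size as a varint) is purely illustrative, as this document does not specify the exact on-disk encoding.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	algoNone byte = 0
	algoS2   byte = 1
)

// encodeHeader prepends a hypothetical compression header: one algorithm
// byte followed by the original (uncompressed) block size as a uvarint.
func encodeHeader(algo byte, origSize uint64, compressed []byte) []byte {
	buf := make([]byte, 1+binary.MaxVarintLen64)
	buf[0] = algo
	n := binary.PutUvarint(buf[1:], origSize)
	return append(buf[:1+n], compressed...)
}

// decodeHeader reads the header back, returning the algorithm, the original
// size (which allows computing a compression ratio without decompressing)
// and the compressed payload.
func decodeHeader(block []byte) (byte, uint64, []byte, error) {
	if len(block) < 2 {
		return 0, 0, nil, errors.New("short block")
	}
	size, n := binary.Uvarint(block[1:])
	if n <= 0 {
		return 0, 0, nil, errors.New("bad size varint")
	}
	return block[0], size, block[1+n:], nil
}

func main() {
	blk := encodeHeader(algoS2, 65536, []byte("compressed-bytes"))
	algo, size, payload, _ := decodeHeader(blk)
	fmt.Println(algo, size, len(payload)) // 1 65536 16
}
```

When filestore encryption is enabled, a header like this would be encrypted along with the rest of the block, as the text above describes.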
40 | 41 | The checksum at the end of the block is specifically excluded from compression and remains on disk as-is, so that checking the block integrity does not require decompressing the entire block. 42 | 43 | ## Decision 44 | 45 | The design is such that different compression algorithms can easily be implemented within the NATS Server if necessary. Initially, only S2 compression is in scope. 46 | 47 | Both block and individual message compression were initially explored. In order to benefit from repetition across individual messages (particularly where the data is structured, i.e. in JSON format), compression at the block level provides significantly better compression ratios over compressing individual messages separately. 48 | 49 | The compression algorithm can be updated after the stream has been created. Newly minted blocks will use the newly selected compression algorithm, but this will not result in existing blocks being proactively compressed or decompressed. An existing block will only be compressed or decompressed according to the newly configured algorithm when it is modified for another reason, i.e. during truncation or compaction. 50 | 51 | ## Consequences 52 | 53 | Compression requires extra system resources, therefore it is anticipated that a compressed stream may suffer some performance penalties compared to an uncompressed stream. 
54 | -------------------------------------------------------------------------------- /adr/ADR-3.md: -------------------------------------------------------------------------------- 1 | # NATS Service Latency Distributed Tracing Interoperability 2 | 3 | 4 | | Metadata | Value | 5 | |----------|-----------------------| 6 | | Date | 2020-05-21 | 7 | | Author | @ripienaar | 8 | | Status | Approved | 9 | | Tags | observability, server | 10 | 11 | ## Context 12 | 13 | The goal is to enable the NATS internal latencies to be exported to distributed tracing systems; here we see a small 14 | architecture using Traefik, a Go microservice and a NATS hosted service all being observed in Jaeger. 15 | 16 | ![Jaeger](images/0003-jaeger-trace.png) 17 | 18 | The lowest 3 spans were created from a NATS latency Advisory. 19 | 20 | These traces can be ingested by many other commercial systems like Datadog and Honeycomb where they can augment the 21 | existing operations tooling in use by our users. Additionally, Grafana 7 supports Jaeger and Zipkin today. 22 | 23 | Long term I think every server that handles a message should emit a unique trace so we can also get visibility into 24 | the internal flow of the NATS system and exactly which gateway connection has a delay - see our current HM issues - but 25 | ultimately I don't think we'll be doing that in the hot path of the server, though these traces are easy to handle async. 26 | 27 | Meanwhile, this proposal will let us get very far with our current Latency Advisories. 28 | 29 | ## Configuring an export 30 | 31 | Today there are no standards for the HTTP headers that communicate span context downstream - with Trace Context being 32 | an emerging w3c standard. 33 | 34 | I suggest we support the Jaeger and Zipkin systems as well as Trace Context for long-term standardisation efforts.
35 | 36 | Supporting these would mean we have to interpret the headers that are received in the request to determine if we should 37 | publish a latency advisory rather than the static `50%` configuration we have today. 38 | 39 | Today we have: 40 | 41 | ``` 42 | exports: [ 43 | { 44 | service: weather.service 45 | accounts: [WEB] 46 | latency: { 47 | sampling: 50% 48 | subject: weather.latency 49 | } 50 | } 51 | ] 52 | ``` 53 | 54 | This enables sampling of `50%` of the service requests on this service. 55 | 56 | I propose we support the additional sampling value `headers` which will configure the server to 57 | interpret the headers as below to determine if a request should be sampled. 58 | 59 | ## Propagating headers 60 | 61 | The `io.nats.server.metric.v1.service_latency` advisory gets updated with an additional `headers` field. 62 | 63 | `headers` contains only the headers used for the sampling decision. 64 | 65 | ```json 66 | { 67 | "type": "io.nats.server.metric.v1.service_latency", 68 | "id": "YBxAhpUFfs1rPGo323WcmQ", 69 | "timestamp": "2020-05-21T08:06:29.4981587Z", 70 | "status": 200, 71 | "headers": { 72 | "Uber-Trace-Id": ["09931e3444de7c99:50ed16db42b98999:0:1"] 73 | }, 74 | "requestor": { 75 | "acc": "WEB", 76 | "rtt": 1107500, 77 | "start": "2020-05-21T08:06:20.2391509Z", 78 | "user": "backend", 79 | "lang": "go", 80 | "ver": "1.10.0", 81 | "ip": "172.22.0.7", 82 | "cid": 6, 83 | "server": "nats2" 84 | }, 85 | "responder": { 86 | "acc": "WEATHER", 87 | "rtt": 1389100, 88 | "start": "2020-05-21T08:06:20.218714Z", 89 | "user": "weather", 90 | "lang": "go", 91 | "ver": "1.10.0", 92 | "ip": "172.22.0.6", 93 | "cid": 6, 94 | "server": "nats1" 95 | }, 96 | "start": "2020-05-21T08:06:29.4917253Z", 97 | "service": 3363500, 98 | "system": 551200, 99 | "total": 6411300 100 | } 101 | ``` 102 | 103 | ## Header Formats 104 | 105 | Numerous header formats are found in the wild; the main ones are Zipkin and Jaeger, with the w3c `tracestate` header being an emerging standard.
106 | 107 | Grafana supports Zipkin and Jaeger, so we should probably support at least those, but also Trace Context for future interop. 108 | 109 | ### Zipkin 110 | 111 | ``` 112 | X-B3-TraceId: 80f198ee56343ba864fe8b2a57d3eff7 113 | X-B3-ParentSpanId: 05e3ac9a4f6e3b90 114 | X-B3-SpanId: e457b5a2e4d86bd1 115 | X-B3-Sampled: 1 116 | ``` 117 | 118 | Also supports a single `b3` header like `b3={TraceId}-{SpanId}-{SamplingState}-{ParentSpanId}` or just `b3=0` 119 | 120 | [Source](https://github.com/openzipkin/b3-propagation) 121 | 122 | ### Jaeger 123 | 124 | ``` 125 | uber-trace-id: {trace-id}:{span-id}:{parent-span-id}:{flags} 126 | ``` 127 | 128 | Where flags are: 129 | 130 | * One byte bitmap, as two hex digits 131 | * Bit 1 (right-most, least significant, bit mask 0x01) is “sampled” flag 132 | * 1 means the trace is sampled and all downstream services are advised to respect that 133 | * 0 means the trace is not sampled and all downstream services are advised to respect that 134 | 135 | Also a number of keys like `uberctx-some-key: value` 136 | 137 | [Source](https://www.jaegertracing.io/docs/1.17/client-libraries/#tracespan-identity) 138 | 139 | ### Trace Context 140 | 141 | Supported by many vendors including things like New Relic 142 | 143 | ``` 144 | traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01 145 | tracestate: rojo=00f067aa0ba902b7,congo=t61rcWkgMzE 146 | ``` 147 | 148 | Here the `01` of `traceparent` means it is sampled.
149 | 150 | [Source](https://www.w3.org/TR/trace-context/) 151 | 152 | ### OpenTelemetry 153 | 154 | Supports Trace Context 155 | 156 | [Source](https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/api.md) 157 | 158 | -------------------------------------------------------------------------------- /adr/ADR-28.md: -------------------------------------------------------------------------------- 1 | # JetStream RePublish 2 | 3 | | Metadata | Value | 4 | |----------|-------------------------| 5 | | Date | 2022-07-08 | 6 | | Author | @derekcollison, @tbeets | 7 | | Status | Implemented | 8 | | Tags | jetstream, server | 9 | 10 | ## Update History 11 | | Date | Author | Description | 12 | |------------|---------|----------------------------------------------------| 13 | | 2023-06-27 | @tbeets | Fix typo on JSON boolean in `headers_only` example | 14 | 15 | ## Context and Problem Statement 16 | 17 | In some use cases it is useful for a subscriber to monitor messages that have been ingested by a stream (captured to 18 | store) without incurring the overhead of defining and using a JS Consumer on the stream. 19 | 20 | Such use cases include (but are not limited to): 21 | 22 | * Lightweight stream publish monitors (such as a dashboard) that don't require the overhead of At-Least-Once delivery 23 | * No side-effect WorkQueue and Interest-Based stream publish monitoring 24 | * KV or Object Store update events as an alternative to watches, e.g. an option for cache invalidation 25 | 26 | ## Design 27 | 28 | If stream _RePublish_ option is configured, a stream will evaluate each published message (that it ingests) against 29 | a _RePublish Source_ subject filter. Upon match, the stream will re-publish the message (with special message headers 30 | as below) to a new _RePublish Destination_ subject derived through a subject transformation. 
31 | 32 | > Re-publish occurs only after the original published message is ingested in the stream (with quorum for R>1 streams) and is 33 | _At-Most-Once_ QoS. 34 | 35 | ### RePublish Configuration Option 36 | 37 | The RePublish option "republish" consists of three configuration fields: 38 | 39 | | Field | Description | JSON | Required | Default | 40 | |--------------|---------------------------------------------|--------------|----------|---------| 41 | | Source | Published Subject-matching filter | src | N | \> | 42 | | Destination | RePublish Subject template | dest | Y | | 43 | | Headers Only | Whether to RePublish only headers (no body) | headers_only | N | false | 44 | 45 | The following validation rules for RePublish option apply: 46 | 47 | * A single token as `>` wildcard is allowed as the Source with meaning taken as any stream-ingested subject. 48 | * Destination MUST have at least 1 non-wildcard token 49 | * Destination MUST NOT match or subset the subject filter(s) of the stream 50 | * Source and Destination must otherwise comply with requirements specified in [ADR-30 Subject Transform](ADR-30.md). 51 | 52 | Here is an example of a stream configuration with the RePublish option specified: 53 | ```text 54 | { 55 | "name": "Stream1", 56 | "subjects": [ 57 | "one.>", 58 | "four.>" 59 | ], 60 | "republish": { 61 | "src": "one.>", 62 | "dest": "uno.>", 63 | "headers_only": false 64 | }, 65 | "retention": "limits", 66 | ... omitted ... 67 | } 68 | ``` 69 | In the configuration above, a published message at `one.foo.bar` will be ingested into `Stream1` as `one.foo.bar` and 70 | re-published as `uno.foo.bar`. Published messages at `four.foo.bar` will be ingested into `Stream1` but not re-published. 71 | 72 | > RePublish option configuration MAY be edited after stream creation. 73 | 74 | ### RePublish Transform 75 | 76 | RePublish Destination, taken together with RePublish Source, form a valid subject token transform rule.
The resulting 77 | transform is applied to each ingested message (that matches the Source configuration) to determine the concrete 78 | RePublish Subject. 79 | 80 | See [ADR-30 Subject Transform](ADR-30.md) for 81 | a description of subject transformation as used by RePublish. 82 | 83 | ### RePublish Headers 84 | 85 | Each RePublished Message will have the following message headers: 86 | 87 | | Header | Value Description | 88 | |--------------------|------------------------------------------------------------------------------------------------------------| 89 | | Nats-Stream | Stream name (in scope to stream's account) | 90 | | Nats-Subject | Message's original subject as ingested into stream | 91 | | Nats-Sequence | This message's stream sequence id | 92 | | Nats-Last-Sequence | The stream sequence id of the last message ingested to the same original subject (or 0 if none or deleted) | 93 | 94 | If headers-only is "true", also: 95 | 96 | | Header | Value Description | 97 | |---------------|-----------------------------------------| 98 | | Nats-Msg-Size | The size in bytes of the message's body | 99 | 100 | > Application-added headers in the original published message will be preserved in the re-published message. 101 | 102 | ### Loop Prevention 103 | 104 | Valid Destination configuration checks ensure that re-published messages are not immediately ingested into the original 105 | stream (causing a loop). The scope of loop detection is the immediate stream only. 106 | 107 | > Caution: It is possible to create a loop condition between two streams sharing an overlap in republish destinations and subject filters 108 | > within a single account.
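To make the transform concrete, the `one.>` → `uno.>` example above can be modeled with a small, self-contained sketch. This is a simplified token mapping, not the server's implementation: `republishSubject` is an illustrative helper, and full ADR-30 transforms support more than a trailing `>` copy.

```go
package main

import (
	"fmt"
	"strings"
)

// republishSubject applies a simplified RePublish transform: tokens of the
// destination template are emitted literally, and a trailing ">" wildcard
// copies the remaining tokens of the ingested subject. This models the
// {"src": "one.>", "dest": "uno.>"} example from the configuration above.
func republishSubject(dest, subject string) string {
	destTokens := strings.Split(dest, ".")
	subjTokens := strings.Split(subject, ".")
	out := make([]string, 0, len(subjTokens))
	for i, t := range destTokens {
		if t == ">" {
			// Copy the remainder of the original subject.
			if i < len(subjTokens) {
				out = append(out, subjTokens[i:]...)
			}
			break
		}
		out = append(out, t)
	}
	return strings.Join(out, ".")
}

func main() {
	// A message ingested at one.foo.bar is re-published at uno.foo.bar.
	fmt.Println(republishSubject("uno.>", "one.foo.bar"))
}
```

The same helper shows why loop prevention matters: if the destination subject matched the stream's own filters, the re-published message would be ingested again.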
-------------------------------------------------------------------------------- /adr/ADR-43.md: -------------------------------------------------------------------------------- 1 | # JetStream Per-Message TTL 2 | 3 | | Metadata | Value | 4 | |----------|---------------------------------| 5 | | Date | 2024-07-11 | 6 | | Author | @ripienaar | 7 | | Status | Implemented | 8 | | Tags | jetstream, client, server, 2.11 | 9 | 10 | ## Context and motivation 11 | 12 | Streams support a one-size-fits-all approach to message TTL based on the MaxAge setting. This causes any message in the Stream to expire at that age. 13 | 14 | There are numerous uses for a per-message version of this limit, some listed below: 15 | 16 | * KV tombstones are a problem in that they forever clog up the buckets with noise; these could have a TTL to make them expire once no longer useful 17 | * Server-applied limits can result in tombstones with a short per-message TTL so that consumers can be notified of limits being processed. Useful in KV watch scenarios for being notified about TTL removals 18 | * A stream may have a general MaxAge but some messages may have infinite retention; think a schema or type hints in a KV bucket that are kept forever while general keys have TTLs 19 | 20 | Related issues: [#3268](https://github.com/nats-io/nats-server/issues/3268) 21 | 22 | ## Per-Message TTL 23 | 24 | ### General Behavior 25 | 26 | We will allow a message to supply a TTL using a header called `Nats-TTL` followed by the duration as seconds or as a Go duration string like `1h`. 27 | 28 | The duration will be used by the server to calculate the deadline for removing the message based on its Stream timestamp and the stated duration. 29 | 30 | Setting the header `Nats-TTL` to `never` will result in a message that will never be expired. 31 | 32 | A TTL of zero will be ignored; any other unparsable value will result in an error reported in the Pub Ack and the message 33 | being discarded. 34 | 35 | When a message with the `Nats-TTL` header is published to a stream with the feature disabled, the message will be rejected with an error. 36 | 37 | ## Limit Markers 38 | 39 | Several scenarios for server-created markers can be imagined; the most often requested one, though, is when MaxAge removes the last value (i.e. the current value) for a Key. 40 | 41 | In this case, when the server removes a message and the message is the last in the subject, it would place a message with a TTL matching the Stream configuration value. The following headers would be placed: 42 | 43 | ``` 44 | Nats-Marker-Reason: MaxAge 45 | Nats-TTL: 1 46 | ``` 47 | 48 | This marker will also be placed for a message removed by the `Nats-TTL` timer. 49 | 50 | This behaviour is off by default unless opted in via the `SubjectDeleteMarkerTTL` Stream Configuration. 51 | 52 | ### Delete API Call Marker 53 | 54 | > [!IMPORTANT] 55 | > This feature will come either later in the 2.11.x series or in 2.12. 56 | 57 | When someone calls the delete message API of a stream the server will place the following headers: 58 | 59 | ``` 60 | Nats-Marker-Reason: Remove 61 | Nats-TTL: 1 62 | ``` 63 | 64 | ### Purge API Call Marker 65 | 66 | > [!IMPORTANT] 67 | > This feature will come either later in the 2.11.x series or in 2.12. 68 | 69 | 70 | When someone calls the purge subject API of a stream the server will place the following headers: 71 | 72 | ``` 73 | Nats-Marker-Reason: Purge 74 | Nats-TTL: 1 75 | ``` 76 | 77 | ### Sources and Mirrors 78 | 79 | Sources and Mirrors will always accept and store messages with the `Nats-TTL` header present, even if the `AllowMsgTTL` setting is disabled in the Stream settings. 80 | 81 | If the `AllowMsgTTL` setting is enabled then processing continues as outlined in the General Behavior section, with messages removed after the TTL. With the setting disabled the messages are just stored.
82 | 83 | Sources may set the `SubjectDeleteMarkerTTL` option, and processing of messages with the `Nats-TTL` header will place tombstones; however, Mirrors may not enable `SubjectDeleteMarkerTTL`, since inserting new messages into the Stream might make it impossible to match sequences from the Mirrored Stream. 84 | 85 | ## Stream Configuration 86 | 87 | Whether or not a stream supports this behavior should be a configuration opt-in. We want clients to know definitively when this is supported, which the opt-in approach with a boolean on the configuration makes clear. 88 | 89 | We have to assume someone will want to create a replication topology where at some point in the topology these tombstone-type messages are retained for an audit trail. So a Stream with this feature enabled can replicate to one with it disabled, and all the messages that would have been TTLed will be retained. 90 | 91 | ```golang 92 | type StreamConfig struct { 93 | // AllowMsgTTL allows header initiated per-message TTLs 94 | AllowMsgTTL bool `json:"allow_msg_ttl"` 95 | 96 | // Enables and sets a duration for adding server markers for delete, purge and max age limits 97 | SubjectDeleteMarkerTTL time.Duration `json:"subject_delete_marker_ttl,omitempty"` 98 | } 99 | ``` 100 | 101 | Restrictions: 102 | 103 | * The `AllowMsgTTL` field can be enabled on existing streams but not disabled. 104 | * The `Nats-TTL` header and the `SubjectDeleteMarkerTTL` setting have a minimum value of 1 second. 105 | * The `SubjectDeleteMarkerTTL` setting may not be set on a Mirror Stream. 106 | * When `AllowMsgTTL` or `SubjectDeleteMarkerTTL` are set the Stream should require API level `1`. 107 | * `AllowRollup` must be `true`; stream update and create should set this unless pedantic mode is enabled. 108 | * `DenyPurge` must be `false`; stream update and create should set this unless pedantic mode is enabled.
109 | * Unless `MaxMsgsPer` equals 1, the server should treat `SubjectDeleteMarkerTTL` as the minimum for `Nats-TTL`, but not reject messages that do not satisfy that. This might be changed in 2.12 depending on some internal implementation fixes in the server. 110 | 111 | -------------------------------------------------------------------------------- /adr/ADR-19.md: -------------------------------------------------------------------------------- 1 | # API prefixes for materialized JetStream views 2 | 3 | |Metadata|Value| 4 | |--------|-----| 5 | |Date |2021-11-18| 6 | |Author |@mhanel| 7 | |Status |Partially Implemented| 8 | |Tags |jetstream, client, kv, objectstore | 9 | 10 | ## Context 11 | 12 | This document describes a design for supporting API prefixes for materialized JS views. 13 | 14 | API prefixes allow the client library to disambiguate access to independent JetStreams that run either in different domains or different accounts. 15 | By specifying the prefix in the API, a client program can essentially pick which one it wants to communicate with. 16 | This mechanism needs to be supported for materialized JS views as well. 17 | 18 | ## Overview 19 | 20 | Each JetStream only listens to default API subjects with the prefix `$JS.API`. 21 | Thus, when the client uses `$JS.API`, it communicates with the JetStream local to its account and domain. 22 | 23 | To avoid traffic going to some other JetStream, the following mechanisms are in place: 24 | 1. Account: Since the API has to be imported with an altered prefix, the request will not cross account boundaries. 25 | In a JetStream-enabled account, an import without a prefix set will result in an error, as JetStream is imported as well and we error on import overlaps. 26 | 2. Domain: On leaf node connections, the NATS server adds in denies for `$JS.API.>`. 27 | 28 | When the client library sets an API prefix, all API subjects the client publishes and subscribes to start with that prefix instead of `$JS.API`.
29 | As a result the API traffic will not end up in the local JetStream, as that always only ever subscribes to `$JS.API....` 30 | 31 | As messages and subscriptions cross boundaries, the following happens: 32 | 1. Accounts: When crossing the import, the API prefix is stripped and replaced with `$JS.API`. 33 | 2. Domain: When crossing the leaf node connection between domains, an automatically inserted mapping strips the API prefix and replaces it with `$JS.API`. 34 | 35 | This JetStream disambiguation mechanism needs to be added to materialized views such as KV and object store as well. 36 | Specifically, we need to tag along on the same API prefix. 37 | Setting different values to reach the same JetStream is a non-starter. 38 | 39 | ## Design 40 | 41 | The first token of any API or materialized view is considered the default API prefix and will have local semantics. 42 | Thus, for the concrete views we treat the tokens `$KV` and `$OBJ` as the respective default API prefixes. 43 | For publishes and subscribes, the API just replaces the first token with the specified (non-default) prefix. 44 | 45 | To share API access across accounts this is sufficient, as account export/import takes care of the rest. 46 | To access an API in the same account but a different domain, the `nats-server` maintaining a leaf node connection will add in the appropriate mappings from the domain-specific API to the local API and deny local API traffic. 47 | 48 | ## KV Example 49 | 50 | Assume the API prefix for JetStream has been set to `JS.from-acc1`. 51 | For JetStream-specific API calls, the local API prefix `$JS.API` has to be replaced with `JS.from-acc1`. 52 | 53 | Because the JetStream-specified API prefix differs from `$JS.API`, the KV API uses the same prefix as is specified for the JetStream API. 54 | 55 | For the KV API we prefix `$KV` with `JS.from-acc1`, resulting in `JS.from-acc1.$KV`.
56 | Thus, in order to put `key` in the bin `bin`, we send to `JS.from-acc1.$KV.bin.key`. 57 | 58 | When crossing the account boundaries, this is then translated back to `$KV.bin.key`. 59 | Thus, the underlying stream still needs to be created with a subscription to `$KV.bin.key`. 60 | 61 | Domains are just a special case of API prefixes and will work the same way. 62 | The API prefix `$JS.<domain>.API` will lead to `$JS.<domain>.API.$KV.bin.key`. 63 | As the leaf node connection into the domain is crossed, the inserted mapping will change the subject back to `$KV.bin.key`. 64 | 65 | ## Consequences 66 | 67 | The proposed change is backwards compatible with materialized views already in use. 68 | 69 | I suggested prefixing so we can have one prefix across different APIs. 70 | To avoid accidental API overlaps going forward, the implication for JetStream is to NOT start the first token after the API prefix with `$`. 71 | 72 | Specifically, JetStream will never expose any functionality under `$JS.API.$KV.>`. 73 | The version with an API prefix would look as follows: `JS.from-acc1.$KV`, which will clash with the same subject used by KV. 74 | This is a side effect of JetStream having a two-token API prefix and the materialized views using a single token. 75 | 76 | This problem can be avoided by unifying the API namespaces to always be two tokens with the second token being `API`, resulting in `$JS.API`, `$KV.API` and `$OBJ.API`. 77 | This, however, will not be backwards compatible. 78 | 79 | ## Testing 80 | 81 | Here is a server config to test your changes.
82 | * The JetStream prefix to use is `fromA` 83 | * The inbox prefix to use is `forI` 84 | 85 | ``` 86 | jetstream: enabled 87 | accounts: { 88 | A: { 89 | users: [ {user: a, password: a} ] 90 | jetstream: enabled 91 | exports: [ 92 | {service: '$JS.API.>' } 93 | {service: '$KV.>'} 94 | {stream: 'forI.>' } 95 | ] 96 | }, 97 | I: { 98 | users: [ {user: i, password: i} ] 99 | imports: [ 100 | {service: {account: A, subject: '$JS.API.>'}, to: 'fromA.>' } 101 | {service: {account: A, subject: '$KV.>'}, to: 'fromA.$KV.>' } 102 | {stream: { subject: 'forI.>', account: 'A' } } 103 | ] 104 | } 105 | } 106 | ``` 107 | 108 | Test JetStream connected to account `I` talking to JetStream in account `A`: `nats account info -s "nats://i:i@localhost:4222" --js-api-prefix fromA` 109 | 110 | KV publishes and subscribes need to support the prefix as well. 111 | Absent an actual implementation, this is simulated with pub/sub. 112 | Your implementation needs to be able to connect to account `I` and access map objects in account `A`.
113 | 114 | ``` 115 | nats -s "nats://a:a@localhost:4222" sub '$KV.map.>' & 116 | sleep 1 117 | nats -s "nats://i:i@localhost:4222" pub 'fromA.$KV.map.put' "hello world" 118 | ``` 119 | -------------------------------------------------------------------------------- /adr/ADR-47.md: -------------------------------------------------------------------------------- 1 | # Request Many 2 | 3 | | Metadata | Value | 4 | |----------|----------------------------| 5 | | Date | 2024-09-26 | 6 | | Author | @aricart, @scottf, @Jarema | 7 | | Status | Partially Implemented | 8 | | Tags | client, spec, orbit | 9 | 10 | | Revision | Date | Author | Info | 11 | |----------|------------|-----------|-------------------------| 12 | | 1 | 2024-09-26 | @scottf | Document Initial Design | 13 | 14 | ## Problem Statement 15 | Have the client support receiving multiple replies from a single request, instead of limiting the client to the first reply, 16 | and support patterns like scatter-gather and sentinel. 17 | 18 | ## Basic Design 19 | 20 | The user can provide some configuration controlling how and how long to wait for messages. 21 | The client handles the requests and subscriptions and provides the messages to the user. 22 | 23 | * The client doesn't assume success or failure - only that it might receive messages. 24 | * The various configuration options are there to manage and short circuit the length of the wait, 25 | and provide the user the ability to directly stop the processing. 26 | * Request Many is not a recoverable operation, but it could be wrapped in a retry pattern. 27 | * The client should communicate status whenever possible, for instance if it gets a 503 No Responders 28 | 29 | ## Config 30 | 31 | ### Total timeout 32 | 33 | The maximum amount of time to wait for responses. When the time is expired, the process is complete. 34 | The wait for the first message is always made with the total timeout since at least one message must come in within the total time. 
35 | 36 | * Always used 37 | * Defaults to the connection or system request timeout. 38 | 39 | ### Stall timer 40 | 41 | The amount of time to wait for messages other than the first (subsequent waits). 42 | The request is considered "stalled" if this timeout is reached, indicating the request is complete. 43 | 44 | * Optional 45 | * Less than 1 or greater than or equal to the total timeout behaves the same as if not supplied. 46 | * Defaults to not supplied. 47 | * When supplied, subsequent waits are the lesser of the stall time or the calculated remaining time. 48 | This allows the total timeout to be honored and for the stall to not extend the loop past the total timeout. 49 | 50 | ### Max messages 51 | 52 | The maximum number of messages to wait for. 53 | * Optional 54 | * If this number of messages is received, the request is complete. 55 | * If this number is supplied and total timeout is not set, total timeout defaults to the connection or system timeout. 56 | 57 | ### Sentinel 58 | 59 | While processing the messages, the user should have the ability to indicate that they no longer want to receive any more messages. 60 | * Optional 61 | * Language-specific implementation 62 | * If a sentinel is supplied and total timeout is not set, total timeout defaults to the connection or system timeout. 63 | 64 | ## Notes 65 | 66 | ### Message Handling 67 | 68 | Each client must determine how to give messages to the user. 69 | * They could all be collected and given at once. 70 | * They could be put in an iterator, queue, channel, etc. 71 | * A callback could be made. 72 | 73 | ### End of Data 74 | 75 | The developer should notify the user when the request has stopped processing if the receiving mechanism is not one, such as a list 76 | or iterator, where termination is obvious. A queue or a callback, for instance, should get a termination message. 77 | Implementation is language-specific based on control flow.
78 | 79 | ### Status Messages / Server Errors 80 | 81 | If a status (like a 503) or an error comes in place of a user message, this is terminal. 82 | This is probably useful information for the user and can be conveyed as part of the end of data. 83 | 84 | #### Callback timing 85 | 86 | If callbacks are made in a blocking fashion, 87 | the client must account for the time it takes for the user to process the message 88 | and not count that time against the timeouts. 89 | 90 | ### Sentinel 91 | 92 | If the client supports a sentinel with a callback/predicate that accepts the message and returns a boolean, 93 | a return of true would mean continue to process and false would mean stop processing. 94 | 95 | If possible, the client should support the "standard sentinel", which is a message with a null/nil or empty payload. 96 | 97 | ### Cancelling 98 | 99 | A client can offer other ways for the user to cancel the request. This is another pathway besides the sentinel, 100 | allowing the developer to cancel the entire request-many operation arbitrarily. 101 | 102 | ## Disconnection 103 | 104 | It's possible that there is a connectivity issue that prevents messages from reaching the requester. 105 | It might be difficult to differentiate that timeout from a total or stall timeout. 106 | If it is possible to know the difference, this could be conveyed as part of the end of data. 107 | 108 | ## Strategies 109 | It's acceptable to make "strategies" via enums / APIs / helpers / builders / whatever. 110 | Strategies are just pre-canned configurations, for example: 111 | 112 | **Timeout or Wait** - this is the default strategy where only the total timeout is used. 113 | 114 | **Stall** - the stall defaults to the lesser of 1/10th of the total wait time (if provided) or the default connection timeout. 115 | 116 | **Max Responses** - accepts a max response number and uses the default timeout.
117 | 118 | ### Subscription Management 119 | Since the client is in charge of the subscription, it should always unsubscribe upon completion of the request handling instead of leaving it up to the server to time it out. 120 | 121 | #### Max Responses Optimization 122 | On requests that specify max responses, and when not using mux inboxes, the client can unsubscribe with a count immediately after subscribing. 123 | Theoretically this unsub could be processed after a reply has come in and out of the server, so you still must check the count manually. 124 | 125 | #### Mux Inbox 126 | If possible, the implementation can offer the use of the mux inbox. 127 | Consider that the implementation of managing the subscription will differ from a non-mux inbox, 128 | for instance not closing the subscription and not implementing the max response optimization. 129 | -------------------------------------------------------------------------------- /adr/ADR-7.md: -------------------------------------------------------------------------------- 1 | # NATS Server Error Codes 2 | 3 | |Metadata|Value| 4 | |--------|-----| 5 | |Date |2021-05-12| 6 | |Author |@ripienaar| 7 | |Status |Partially Implemented| 8 | |Tags |server, client, jetstream| 9 | 10 | ## Status 11 | 12 | Partially Implemented in [#1811](https://github.com/nats-io/nats-server/issues/1811) and [#2409](https://github.com/nats-io/nats-server/pull/2409) 13 | 14 | The current focus is the JetStream APIs; as a follow-up we will do a refactor and generalization and move on to other 15 | areas of the server. 16 | 17 | ## Context 18 | 19 | When a developer performs a Consumer Info API request they might get a 404 error; there is no way to know if this is 20 | a 404 due to the Stream not existing or the Consumer not existing. The only way is to parse the returned error description
Further several other error situations might arise which due to our code design would be surfaced 22 | as a 404 when in fact it's more like a 5xx error - I/O errors and such. 23 | 24 | If users are parsing our error strings it means our error text form part of the Public API, we can never improve errors, 25 | fix spelling errors or translate errors into other languages. 26 | 27 | This ADR describes an additional `error_code` that provides deeper context into the underlying cause of the 404. 28 | 29 | ## Design 30 | 31 | We will adopt a numbering system for our errors where every error has a unique number within a range that indicates the 32 | subsystem it belongs to. 33 | 34 | |Range|Description| 35 | |-----|-----------| 36 | |1xxxx|JetStream related errors| 37 | |2xxxx|MQTT related errors| 38 | 39 | The JetStream API error will be adjusted like this initially with later work turning this into a more generic error 40 | usable in other parts of the NATS Server code base. 41 | 42 | ```go 43 | // ApiError is included in all responses if there was an error. 44 | type ApiError struct { 45 | Code int `json:"code"` 46 | ErrCode int `json:"err_code,omitempty"` 47 | Description string `json:"description,omitempty"` 48 | URL string `json:"-"` 49 | Help string `json:"-"` 50 | } 51 | ``` 52 | 53 | Here the `code` and `error_code` is what we'll consider part of the Public API with `description` being specifically 54 | out of scope for SemVer protection and changes to these will not be considered a breaking change. 55 | 56 | The `ApiError` type will implement `error` and whenever it will be logged will append the code to the log line, for example: 57 | 58 | ``` 59 | stream not found (10059) 60 | ``` 61 | 62 | The `nats` CLI will have a lookup system like `nats error 10059` that will show details of this error including help, 63 | urls and such. It will also assist in listing and searching errors. The same could be surfaced later in documentation 64 | and other areas. 
65 | 66 | ## Using in code 67 | 68 | ### Raising an error 69 | 70 | Here we raise a `stream not found` error without providing any additional context to the user; the constant is 71 | `JSStreamNotFoundErr`, from which you can tell it takes no printf-style interpolations, versus one that does, which would 72 | end in `...ErrF`: 73 | 74 | The Go doc for this function would also include the content of the error to assist via IntelliSense in your IDE. 75 | 76 | ```go 77 | err = doThing() 78 | if err != nil { 79 | return NewJSStreamNotFoundError() 80 | } 81 | ``` 82 | 83 | Some errors require string interpolation of tokens, e.g. the `JSStreamRestoreErrF` has the body `restore failed: {err}`; 84 | the `NewJSStreamRestoreError(err error)` will use the arguments to fill in the tokens, where the argument name matches the token. 85 | Some tokens have a specific type meaning; for example the token `{err}` will always expect an `error` type and `{seq}` always a 86 | `uint64`, to help a bit with sanity checks at compile time. 87 | 88 | Note these errors that have tokens are new instances of the ApiError, so a normal comparison of `err == ApiErrors[x]` will fail; 89 | the `IsNatsErr()` helper should generally always be used to compare errors. 90 | 91 | ```go 92 | err = doRestore() 93 | if err != nil { 94 | return NewJSStreamRestoreError(err) 95 | } 96 | ``` 97 | 98 | If we have to handle an error that may be an `ApiError` or a traditional Go error, we can use the `Unless` ErrorOption. 99 | This example will look at the result from `lookupConsumer()`: if it's an `ApiError` that error will be set, else a `JSConsumerNotFoundErr` will be 100 | returned with the error message placed in the `{err}` token. 101 | 102 | Essentially the `lookupConsumer()` would return a `JSStreamNotFoundErr` if the stream does not exist, else a `JSConsumerNotFoundErr` 103 | or a plain Go error on I/O failure, for example.
104 | 105 | ```go 106 | var resp = JSApiConsumerCreateResponse{ApiResponse: ApiResponse{Type: JSApiStreamCreateResponseType}} 107 | 108 | _, err = lookupConsumer(stream, consumer) 109 | if err != nil { 110 | resp.Error = NewJSConsumerNotFoundError(err, Unless(err)) 111 | } 112 | ``` 113 | 114 | ### Testing Errors 115 | 116 | Should you need to determine if an error is of a specific kind (error code), this can be done using the `IsNatsErr()` function: 117 | 118 | ```go 119 | err = doThing() 120 | if IsNatsErr(err, JSStreamNotFoundErr, JSConsumerNotFoundErr) { 121 | // create the stream and consumer 122 | } else if err != nil { 123 | // other critical failure 124 | } 125 | ``` 126 | 127 | ## Maintaining the errors 128 | 129 | The file `server/errors.json` holds the data used to generate the error constants, lists etc. This is a JSON version of 130 | `server.ErrorsData`. 131 | 132 | ```json 133 | [ 134 | { 135 | "constant": "JSClusterPeerNotMemberErr", 136 | "code": 400, 137 | "error_code": 10040, 138 | "description": "peer not a member" 139 | }, 140 | { 141 | "constant": "JSNotEnabledErr", 142 | "code": 503, 143 | "error_code": 10039, 144 | "description": "JetStream not enabled for account", 145 | "help": "This error indicates that JetStream is not enabled either at a global level or at global and account level", 146 | "url": "https://docs.nats.io/jetstream" 147 | } 148 | ] 149 | ``` 150 | 151 | The `nats` CLI allows you to edit, add and view these files using the `nats errors` command; use the `--errors` flag to 152 | view your local file during development. 153 | 154 | After editing this file run `go generate` at the top of the `nats-server` repo, and it will update the needed files. Check in the result.
156 | 157 | When run, this will verify that the `error_code` and `constant` are unique for each error 158 | -------------------------------------------------------------------------------- /adr/ADR-54.md: -------------------------------------------------------------------------------- 1 | # KV Codecs 2 | 3 | | Metadata | Value | 4 | |----------|------------------------------------| 5 | | Date | 2025-08-06 | 6 | | Author | @piotrpio | 7 | | Status | Proposed | 8 | | Tags | jetstream, client, spec, orbit, kv | 9 | | Updates | ADR-8 | 10 | 11 | | Revision | Date | Author | Info | 12 | |----------|------------|-----------|----------------| 13 | | 1 | 2025-08-06 | @piotrpio | Initial design | 14 | 15 | ## Context and Problem Statement 16 | 17 | JetStream Key-Value stores require flexible data transformation capabilities to handle various encoding scenarios such as character escaping, path notation conversion, encryption, and custom transformations. Currently, these transformations must be implemented at the application level, leading to inconsistent implementations and increased complexity. 18 | 19 | A standardized codec system would provide transparent encoding and decoding of keys and values while maintaining full compatibility with the existing KV API, enabling users to handle special characters, implement security features, and perform custom transformations seamlessly. 20 | 21 | ## Context 22 | 23 | The JetStream Key-Value store uses NATS subjects as keys and message payloads as values. Several use cases require transformation of these elements: 24 | 25 | 1. **Character Escaping**: Keys containing special characters (spaces, dots, etc.) that are invalid in NATS subjects 26 | 2. **Path Translation**: Converting between different path notation styles (e.g., filesystem paths to NATS subjects) 27 | 3. **Security**: End-to-end encryption 28 | 4.
**Custom Transformations**: Application-specific encoding requirements 29 | 30 | ## Design 31 | 32 | ### Core Interfaces 33 | 34 | The codec system is based on two separate interfaces for transforming keys and values: 35 | 36 | ```go 37 | // KeyCodec transforms keys before storage and after retrieval 38 | type KeyCodec interface { 39 | Encode(key string) string 40 | Decode(encoded string) string 41 | } 42 | 43 | // ValueCodec transforms values before storage and after retrieval 44 | type ValueCodec interface { 45 | Encode(value []byte) []byte 46 | Decode(encoded []byte) []byte 47 | } 48 | ``` 49 | 50 | ### Creating a Codec-Enabled KV Bucket 51 | 52 | A codec-enabled KV bucket wraps the standard KV interface. It should be possible to use codecs for both keys and values independently, allowing for flexible transformations. 53 | 54 | ```go 55 | type CodecKV struct { 56 | kv jetstream.KeyValue 57 | keyCodec KeyCodec 58 | valueCodec ValueCodec 59 | } 60 | 61 | // Constructor functions 62 | func New(kv jetstream.KeyValue, keyCodec KeyCodec, valueCodec ValueCodec) jetstream.KeyValue 63 | 64 | // NewForKey creates a KV bucket with a key codec only, using no-op value codec. 65 | func NewForKey(kv jetstream.KeyValue, keyCodec KeyCodec) jetstream.KeyValue 66 | 67 | // NewForValue creates a KV bucket with a value codec only, using no-op key codec. 
68 | func NewForValue(kv jetstream.KeyValue, valueCodec ValueCodec) jetstream.KeyValue 69 | ``` 70 | 71 | ### Transparent Operation 72 | 73 | All KV operations work transparently with codecs: 74 | 75 | ```go 76 | // Standard KV operations 77 | err := codecKV.Put(ctx, key, value) 78 | entry, err := codecKV.Get(ctx, key) 79 | watcher, err := codecKV.Watch(ctx, pattern) 80 | keys, err := codecKV.Keys(ctx) 81 | 82 | // History and other operations 83 | history, err := codecKV.History(ctx, key) 84 | err := codecKV.Delete(ctx, key) 85 | err := codecKV.Purge(ctx, key) 86 | ``` 87 | 88 | ### Custom Codec Implementation 89 | 90 | Users can implement custom codecs for specific requirements: 91 | 92 | ```go 93 | type AESCodec struct { 94 | cipher cipher.Block 95 | } 96 | 97 | func (c *AESCodec) Encode(value []byte) []byte { 98 | // Implement AES encryption 99 | return encrypted 100 | } 101 | 102 | func (c *AESCodec) Decode(value []byte) []byte { 103 | // Implement AES decryption 104 | return decrypted 105 | } 106 | 107 | // Usage 108 | aesCodec := &AESCodec{cipher: aesCipher} 109 | codecKV := kvcodec.NewForValue(kv, aesCodec) 110 | ``` 111 | 112 | ### Watch and Wildcard Support 113 | 114 | Key codecs should optionally support wildcard patterns for watching and listing keys. The `KeyCodec` interface can be extended to handle wildcard patterns, allowing users to specify how wildcards should be encoded and decoded. 115 | 116 | Special handling of wildcards should be optional when implementing custom codecs, in a language idiomatic way. 
117 | 118 | ```go 119 | type FilterableKeyCodec interface { 120 | KeyCodec 121 | EncodeFilter(filter string) (string, error) 122 | } 123 | 124 | // Example implementation 125 | func (c *CustomCodec) EncodeFilter(filter string) (string, error) { 126 | // Handle wildcard patterns specially to preserve filtering 127 | return encodePreservingWildcards(filter), nil 128 | } 129 | ``` 130 | 131 | ### Codec Chaining 132 | 133 | It should be possible to chain multiple codecs together, allowing for complex transformations. Keys or values should be processed through each codec in sequence and decoded in reverse order. 134 | 135 | ```go 136 | keyChain, _ := kvcodec.NewKeyChainCodec(pathCodec, base64Codec) 137 | valueChain, _ := kvcodec.NewValueChainCodec(aesCodec, base64Codec) 138 | codecKV := kvcodec.New(kv, keyChain, valueChain) 139 | ``` 140 | 141 | ### Built-in Codecs 142 | 143 | #### 1. NoOpCodec 144 | 145 | Passes data through unchanged, useful for selective encoding: 146 | 147 | ```go 148 | type NoOpCodec struct{} 149 | 150 | func (c NoOpCodec) Encode(s string) string { return s } 151 | func (c NoOpCodec) Decode(s string) string { return s } 152 | ``` 153 | 154 | #### 2. Base64Codec 155 | 156 | Encodes keys/values using URL-safe base64: 157 | 158 | ```go 159 | type Base64Codec struct{} 160 | 161 | // Example usage: 162 | codec := kvcodec.Base64Codec() 163 | codecKV := kvcodec.New(kv, codec, kvcodec.NoOpCodec()) 164 | 165 | // "Acme Inc.contact" becomes "QWNtZSBJbmMuY29udGFjdA==" 166 | codecKV.Put(ctx, "Acme Inc.contact", []byte("info@acme.com")) 167 | ``` 168 | 169 | #### 3.
PathCodec 170 | 171 | Translates between path-style and NATS-style keys: 172 | 173 | ```go 174 | type PathCodec struct { 175 | separator string 176 | } 177 | 178 | // Example: converts "user/profile/settings" to "user.profile.settings" 179 | codec := kvcodec.NewPathCodec("/") 180 | codecKV := kvcodec.NewForKey(kv, codec) 181 | ``` 182 | 183 | Because leading and trailing slashes cannot be preserved directly as leading and trailing dots, the codec should handle these cases by encoding a leading slash as `_root_` and trimming trailing slashes. 184 | -------------------------------------------------------------------------------- /adr/ADR-14.md: -------------------------------------------------------------------------------- 1 | # JWT library free jwt user generation 2 | 3 | |Metadata|Value| 4 | |--------|-----| 5 | |Date |2021-07-20| 6 | |Author |@mhanel, @aricart | 7 | |Status |Approved| 8 | |Tags |client, security| 9 | 10 | ## Context and Problem Statement 11 | 12 | Users writing programs to issue user JWTs ask if this can be done in **languages other than Go**. 13 | We assume that these programs stamp out a set number of identical user profiles. 14 | 15 | With the addition of scoped signing keys, limits and permissions are **attached to the signing key instead of the user**. 16 | As a consequence, user JWT generating programs do not need the data model the jwt library provides. 17 | The use of scoped signing keys is highly encouraged, as a compromised key can only be used to issue certain users, NOT users with arbitrary permissions. 18 | 19 | Thus, a simple utility function, added to nkey-supporting client libraries, can replace the need for the Go-only jwt library for most use cases. 20 | 21 | > This can be a series of standalone repositories as well. 22 | > Because this is a single and small utility function, adding it to suitable client libraries seems easiest for usage as it meets the users where they are (similar to nkey functions in some languages).
23 | > Furthermore, this may not be needed for every client library. 24 | > Initially we'd focus on languages widely used by corporate users where jwt generating programs are to be expected. 25 | > Those would be: `go`, `java`, `csharp` 26 | 27 | ## References 28 | 29 | This is going to be a much simplified version of this [c# example](https://docs.nats.io/developing-with-nats/tutorials/jwt#stamping-jwt-in-languages-other-than-go). 30 | Instead of the [go jwt library](https://github.com/nats-io/jwt) with the full data model, `sprintf` is used with very few values. 31 | Any library implementing this will require basic `nkey` functions to `sign` and `generate` new nkeys. 32 | `sign` will exist in every library supporting nkeys. 33 | Generating new nkeys is a matter of generating random bytes, encoding them, and decorating them. 34 | 35 | ## Design 36 | 37 | The proposed function is supposed to only set values that are not covered by the scoped signing key. 38 | These values are `name`, `expiration`, `account id`, `tags`. 39 | The `user nkey` can be generated by the signer, or a requestor can provide it so the signer is unaware of the private nkey portion. 40 | So that users can generate nkeys when none are provided, the functionality to generate a user nkey needs to be public as well. 41 | 42 | Proposed name and function signature: 43 | ``` 44 | /** 45 | * signingKey, is a mandatory account nkey pair to sign the generated jwt. 46 | * accountId, is a mandatory public account nkey. Will return error when not set or not account nkey. 47 | * publicUserKey, is a mandatory public user nkey. Will return error when not set or not user nkey. 48 | * name, optional human readable name. When absent, default to publicUserKey. 49 | * expiration, optional but recommended duration, when the generated jwt needs to expire. If not set, JWT will not expire. 50 | * tags, optional list of tags to be included in the JWT. 51 | * 52 | * Returns: 53 | * error, when issues arise.
54 | * string, resulting jwt. 55 | **/ 56 | IssueUserJWT(signingKey nkey, accountId string, publicUserKey string, name string, expiration time.Duration, tags []string) (error, string) 57 | ``` 58 | 59 | Static header json: 60 | ```json 61 | {"typ":"JWT","alg":"ed25519-nkey"} 62 | ``` 63 | 64 | Commented body (//) json output that is subsequently turned into a jwt (treat uncommented values as static): 65 | ```json 66 | { 67 | // expiration (when specified) time in unix seconds, derived from provided duration. 68 | "exp": 1632077055, 69 | // issue time in unix seconds. Will always need to be set. 70 | "iat": 1626720255, 71 | // public nkey portion of signingKey 72 | "iss": "AACYICOAQMQ72EHT35R7LV6VFWMIVWFKWFE5P2JJ2TT674EO7DJTUHMM", 73 | // unique implementation dependent value 74 | // our jwt library uses base32 encoded sha256 hash of this json with "jti":"" 75 | "jti": "FRQCL7TPAIL6KHKPPPJS2YOHTANKNHTPLTX7STGPTGYZVGTOH2LQ", 76 | // name provided or value of publicUserKey 77 | "name": "USER_NAME", 78 | "nats": { 79 | // value of accountId 80 | "issuer_account": "ADECCNBUEBWZ727OMBFSN7OMK2FPYRM52TJS25TFQWYS76NPOJBN3KU4", 81 | // tags, omit fully when not provided 82 | "tags": [ 83 | "provided_tag1", 84 | "provided_tag2" 85 | ], 86 | "type": "user", 87 | "version": 2 88 | }, 89 | // value of publicUserKey 90 | "sub": "UD44C3VDAEYG527W3VPY353B3C6LIWJNW77GJED7MM5WIPGRUEVPHRZ5" 91 | } 92 | ``` 93 | The `jti` can be a random value. If possible, we should compute it the same way the jwt library does: 94 | 1) The body is first generated with `jti` set to `""`. 95 | 2) Then the `jti` value is computed as base32 encoded sha256 hash of the body. 96 | 3) Then the body is generated again, this time with the correct `jti`. 97 | 98 | Both the static header json as well as the body are then base64url encoded. 99 | Please note that base64url is slightly different from base64. 100 | Padding characters `=` are removed, `+` is changed to `-`, and `/` is changed to `_`.
101 | Compute the signature using signingKey over the following string (the dot is included as per the JWT specification): 102 | ```go 103 | base64url-header + "." + base64url-body 104 | ``` 105 | 106 | Encode the signature with base64url, concatenate and return the resulting token: 107 | ```go 108 | base64url-header + "." + base64url-body + "." + base64url-signature 109 | ``` 110 | 111 | ## Consequences 112 | 113 | Removes the need for the jwt library when programmatically generating user JWTs using scoped signing keys. 114 | 115 | ## Scoped Signing Key Setup for Development 116 | 117 | To generate an account named ACC, generate and add a signing key to it, and promote the generated signing key to a scoped one with an arbitrary role name for easier referencing, issue the following commands: 118 | ```bash 119 | > nsc add account -n ACC 120 | > nsc edit account --name ACC --sk generate 121 | > nsc edit signing-key --account ACC --sk --role anyrolename 122 | ``` 123 | 124 | To add a user with values similar to the earlier json body, sign it with the scoped signing key, and output the json, execute: 125 | ```bash 126 | > nsc add user --account ACC --name USER_NAME --tag PROVIDED_TAG1 --tag PROVIDED_TAG2 --expiry 2h --private-key anyrolename 127 | > nsc describe user --account ACC --name USER_NAME --json 128 | ``` 129 | 130 | An invocation of the utility function generating a similar user would look like this: 131 | ```go 132 | IssueUserJWT(accSigningKey, "ADECCNBUEBWZ727OMBFSN7OMK2FPYRM52TJS25TFQWYS76NPOJBN3KU4", "UD44C3VDAEYG527W3VPY353B3C6LIWJNW77GJED7MM5WIPGRUEVPHRZ5", "USER_NAME", 2*time.Hour, []string{"PROVIDED_TAG1", "PROVIDED_TAG2"}) 133 | ``` 134 | -------------------------------------------------------------------------------- /main.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "fmt" 5 | "os" 6 | "path" 7 | "slices" 8 | "sort" 9 | "strconv" 10 | "strings" 11 | "text/template" 12 |
"time" 13 | 14 | "gitlab.com/golang-commonmark/markdown" 15 | ) 16 | 17 | type ADRMeta struct { 18 | Index int 19 | Authors []string 20 | Date time.Time 21 | Status string 22 | Tags []string 23 | Path string 24 | Updates []string 25 | } 26 | 27 | type ADR struct { 28 | Heading string 29 | Meta ADRMeta 30 | } 31 | 32 | var ( 33 | validStatus = []string{"Proposed", "Approved", "Partially Implemented", "Implemented", "Deprecated"} 34 | allowedHeaders = []string{"Date", "Author", "Status", "Tags", "Updates"} 35 | ) 36 | 37 | func parseCommaList(l string) []string { 38 | tags := strings.Split(l, ",") 39 | res := []string{} 40 | for _, t := range tags { 41 | res = append(res, strings.TrimSpace(t)) 42 | } 43 | return res 44 | } 45 | 46 | func parseADR(adrPath string) (*ADR, error) { 47 | body, err := os.ReadFile(adrPath) 48 | if err != nil { 49 | panic(err) 50 | } 51 | 52 | md := markdown.New() 53 | tokens := md.Parse(body) 54 | 55 | in1stHdr := false 56 | in1stTbl := false 57 | curHdrKey := "" 58 | 59 | adr := ADR{ 60 | Meta: ADRMeta{ 61 | Path: adrPath, 62 | }, 63 | } 64 | 65 | base := strings.TrimSuffix(path.Base(adrPath), path.Ext(adrPath)) 66 | parts := strings.Split(base, "-") 67 | if len(parts) != 2 { 68 | return nil, fmt.Errorf("invalid filename %s in %s", base, adrPath) 69 | } 70 | 71 | idx, err := strconv.Atoi(parts[1]) 72 | if err != nil { 73 | return nil, fmt.Errorf("invalid file sequence %s in %s", parts[1], adrPath) 74 | } 75 | 76 | adr.Meta.Index = idx 77 | 78 | var shouldStop bool 79 | 80 | for _, t := range tokens { 81 | switch tok := t.(type) { 82 | case *markdown.Inline: 83 | switch { 84 | case in1stHdr: 85 | adr.Heading = tok.Content 86 | 87 | case curHdrKey == "Date": 88 | t, err := time.Parse("2006-01-02", tok.Content) 89 | if err != nil { 90 | return nil, fmt.Errorf("invalid date format, not YYYY-MM-DD: %s", err) 91 | } 92 | 93 | adr.Meta.Date = t 94 | case curHdrKey == "Author": 95 | adr.Meta.Authors = parseCommaList(tok.Content) 96 | 97 | case 
curHdrKey == "Status": 98 | adr.Meta.Status = tok.Content 99 | 100 | case curHdrKey == "Tags": 101 | adr.Meta.Tags = parseCommaList(tok.Content) 102 | 103 | case curHdrKey == "Updates": 104 | adr.Meta.Updates = parseCommaList(tok.Content) 105 | 106 | case in1stTbl: 107 | if !slices.Contains(allowedHeaders, tok.Content) { 108 | return nil, fmt.Errorf("invalid header %s in %s", tok.Content, adrPath) 109 | } 110 | 111 | curHdrKey = tok.Content 112 | } 113 | 114 | case *markdown.TbodyOpen: 115 | in1stTbl = true 116 | 117 | case *markdown.HeadingOpen: 118 | if adr.Heading == "" { 119 | in1stHdr = true 120 | } 121 | 122 | case *markdown.TbodyClose: 123 | in1stTbl = false 124 | shouldStop = true 125 | 126 | case *markdown.HeadingClose: 127 | in1stHdr = false 128 | 129 | case *markdown.TableClose: 130 | in1stTbl = false 131 | 132 | case *markdown.TrClose: 133 | curHdrKey = "" 134 | } 135 | 136 | if shouldStop { 137 | break 138 | } 139 | } 140 | 141 | if adr.Meta.Index == 0 { 142 | return nil, fmt.Errorf("invalid ADR Index in %s", adr.Meta.Path) 143 | } 144 | if adr.Meta.Date.IsZero() { 145 | return nil, fmt.Errorf("date is required in %s", adr.Meta.Path) 146 | } 147 | if !isValidStatus(adr.Meta.Status) { 148 | return nil, fmt.Errorf("invalid status %q, must be one of: %s in %s", adr.Meta.Status, strings.Join(validStatus, ", "), adr.Meta.Path) 149 | } 150 | if len(adr.Meta.Authors) == 0 { 151 | return nil, fmt.Errorf("authors is required in %s", adr.Meta.Path) 152 | } 153 | if len(adr.Meta.Tags) == 0 { 154 | return nil, fmt.Errorf("tags is required in %s", adr.Meta.Path) 155 | } 156 | 157 | if len(adr.Meta.Updates) > 0 { 158 | list := []string{} 159 | for _, u := range adr.Meta.Updates { 160 | list = append(list, fmt.Sprintf("[%s](adr/%s.md)", u, u)) 161 | } 162 | adr.Heading = fmt.Sprintf("%s (updating %s)", adr.Heading, strings.Join(list, ", ")) 163 | } 164 | 165 | return &adr, nil 166 | } 167 | 168 | func isValidStatus(status string) bool { 169 | for _, s := range 
validStatus { 170 | if status == s { 171 | return true 172 | } 173 | } 174 | 175 | return false 176 | } 177 | 178 | func verifyUniqueIndexes(adrs []*ADR) error { 179 | indexes := map[int]string{} 180 | for _, a := range adrs { 181 | path, ok := indexes[a.Meta.Index] 182 | if ok { 183 | return fmt.Errorf("duplicate index %d, conflict between %s and %s", a.Meta.Index, a.Meta.Path, path) 184 | } 185 | indexes[a.Meta.Index] = a.Meta.Path 186 | } 187 | 188 | return nil 189 | } 190 | 191 | func renderIndexes(adrs []*ADR) error { 192 | tags := map[string]int{} 193 | for _, adr := range adrs { 194 | for _, tag := range adr.Meta.Tags { 195 | tags[tag] = 1 196 | } 197 | } 198 | 199 | tagsList := []string{} 200 | hasDeprecated := false 201 | for k := range tags { 202 | if k == "deprecated" { 203 | hasDeprecated = true 204 | continue 205 | } 206 | tagsList = append(tagsList, k) 207 | } 208 | sort.Strings(tagsList) 209 | if hasDeprecated { 210 | tagsList = append(tagsList, "deprecated") 211 | } 212 | 213 | type tagAdrs struct { 214 | Tag string 215 | Adrs []*ADR 216 | } 217 | 218 | renderList := []tagAdrs{} 219 | 220 | for _, tag := range tagsList { 221 | matched := []*ADR{} 222 | for _, adr := range adrs { 223 | for _, mt := range adr.Meta.Tags { 224 | if tag == mt { 225 | matched = append(matched, adr) 226 | } 227 | } 228 | } 229 | 230 | sort.Slice(matched, func(i, j int) bool { 231 | return matched[i].Meta.Index < matched[j].Meta.Index 232 | }) 233 | 234 | renderList = append(renderList, tagAdrs{Tag: tag, Adrs: matched}) 235 | } 236 | 237 | funcMap := template.FuncMap{ 238 | "join": func(i []string) string { 239 | return strings.Join(i, ", ") 240 | }, 241 | "title": func(i string) string { 242 | return strings.Title(i) 243 | }, 244 | } 245 | 246 | readme, err := template.New(".readme.templ").Funcs(funcMap).ParseFiles(".readme.templ") 247 | if err != nil { 248 | return err 249 | } 250 | err = readme.Execute(os.Stdout, renderList) 251 | if err != nil { 252 | return err 253 | } 
254 | return nil 255 | } 256 | 257 | func main() { 258 | dir, err := os.ReadDir("adr") 259 | if err != nil { 260 | panic(err) 261 | } 262 | 263 | adrs := []*ADR{} 264 | 265 | for _, mdf := range dir { 266 | if mdf.IsDir() { 267 | continue 268 | } 269 | 270 | if path.Ext(mdf.Name()) != ".md" { 271 | continue 272 | } 273 | 274 | adr, err := parseADR(path.Join("adr", mdf.Name())) 275 | if err != nil { 276 | panic(err) 277 | } 278 | 279 | adrs = append(adrs, adr) 280 | } 281 | 282 | err = verifyUniqueIndexes(adrs) 283 | if err != nil { 284 | panic(err) 285 | } 286 | 287 | err = renderIndexes(adrs) 288 | if err != nil { 289 | panic(err) 290 | } 291 | } 292 | -------------------------------------------------------------------------------- /adr/ADR-4.md: -------------------------------------------------------------------------------- 1 | # NATS Message Headers 2 | 3 | | Metadata | Value | 4 | |----------|----------------------------| 5 | | Date | 2021-05-12 | 6 | | Author | @aricart, @scottf, @tbeets | 7 | | Status | Implemented | 8 | | Tags | server, client | 9 | 10 | 11 | | Revision | Date | Description | 12 | |----------|------------|------------------| 13 | | 1 | 2021-05-12 | Initial document | 14 | | 2 | 2025-05-15 | Clarified ASCII | 15 | 16 | ## Context 17 | 18 | This document describes NATS Headers from the perspective of clients. NATS 19 | headers allow clients to specify additional meta-data in the form of headers. 20 | NATS headers are similar to 21 | [HTTP Headers](https://tools.ietf.org/html/rfc7230#section-3.2) with some important differences. 22 | 23 | As with HTTP headers: 24 | 25 | - Each header field consists of a field name followed by a 26 | colon (`:`), optional leading whitespace, the field value, and optional 27 | trailing whitespace. 28 | - No spaces are allowed between the header field name and colon. 29 | - Field value may be preceded or followed by optional whitespace. 
30 | - The specification may allow any number of strange things like comments/tokens 31 | etc. 32 | - The keys can repeat. 33 | 34 | More specifically from [rfc822](https://www.ietf.org/rfc/rfc822.txt) Section 35 | 3.1.2: 36 | 37 | > Once a field has been unfolded, it may be viewed as being composed of a 38 | > field-name followed by a colon (":"), followed by a field-body, and terminated 39 | > by a carriage-return/line-feed. The field-name must be composed of printable 40 | > ASCII characters (i.e., characters that have values between 33. and 126., 41 | > decimal, except colon). The field-body may be composed of any ASCII 42 | > characters, except CR or LF. (While CR and/or LF may be present in the actual 43 | > text, they are removed by the action of unfolding the field.) 44 | 45 | and Section 3.3: 46 | ```text 47 | ; ( Octal, Decimal.) 48 | CHAR = <any ASCII character> ; ( 0-177, 0.-127.) 49 | ``` 50 | 51 | ### Unique to NATS Headers 52 | 53 | ###### Version header 54 | Instead of an HTTP method followed by a resource, and the HTTP version (`GET / HTTP/1.1`), 55 | NATS provides a string identifying the header version (`NATS/X.x`), 56 | currently 1.0, so it is rendered as `NATS/1.0␍␊`. 57 | 58 | ###### Case preserving 59 | NATS treats application headers as a part of the message _payload_ and is agnostic to the 60 | application use-case between publishers and subscribers; therefore, NATS headers are _case preserving_. 61 | The server will not change the case in message conveyance; the publisher's case will be preserved. 62 | 63 | Any case sensitivity in header interpretation is the responsibility of the application and client participants. 64 | 65 | > Note: This is _different_ from HTTP headers which declare/define that web server and user-agent participants should ignore case. 66 | 67 | With the above caveats, please refer to the 68 | [specification](https://tools.ietf.org/html/rfc7230#section-3.2) for information 69 | on how to encode/decode HTTP headers.
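As a minimal sketch of the format described above (illustrative only, not a normative parser), a serialized NATS header block can be decoded into a case-preserving multimap that also retains repeated keys:

```go
package main

import (
	"fmt"
	"strings"
)

// parseHeaders decodes a NATS header block: a NATS/1.0 version line,
// CRLF-delimited "Name: value" lines, terminated by an empty line.
// Keys are kept exactly as sent (case preserving, repeats allowed).
func parseHeaders(block string) (map[string][]string, error) {
	body := strings.TrimSuffix(block, "\r\n\r\n")
	lines := strings.Split(body, "\r\n")
	if len(lines) == 0 || !strings.HasPrefix(lines[0], "NATS/1.0") {
		return nil, fmt.Errorf("missing NATS/1.0 version line")
	}
	hdrs := map[string][]string{}
	for _, line := range lines[1:] {
		name, value, ok := strings.Cut(line, ":")
		if !ok {
			return nil, fmt.Errorf("malformed header line %q", line)
		}
		// Only surrounding whitespace of the value is trimmed;
		// the key's case is left untouched.
		hdrs[name] = append(hdrs[name], strings.TrimSpace(value))
	}
	return hdrs, nil
}

func main() {
	h, _ := parseHeaders("NATS/1.0\r\nHeader1: X\r\nHeader1: Y\r\n\r\n")
	fmt.Println(h["Header1"]) // [X Y]
}
```

The function name is hypothetical; real clients typically expose this through their message API rather than as a standalone parser.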
70 | 71 | ### Enabling Message Headers 72 | 73 | A server that is able to send and receive headers will specify so in its 74 | [`INFO`](https://docs.nats.io/nats-protocol/nats-protocol#info) protocol 75 | message. The `headers` field, if present, will have a boolean value. If the 76 | client wishes to send headers, it must add a `headers` field 77 | with the value `true` in its 78 | [`CONNECT` message](https://docs.nats.io/nats-protocol/nats-protocol#connect): 79 | 80 | ``` 81 | "lang": "node", 82 | "version": "1.2.3", 83 | "protocol": 1, 84 | "headers": true, 85 | ... 86 | ``` 87 | 88 | ### Publishing Messages With A Header 89 | 90 | Messages that include a header use the `HPUB` protocol: 91 | 92 | ``` 93 | HPUB SUBJECT REPLY 23 30␍␊NATS/1.0␍␊Header: X␍␊␍␊PAYLOAD␍␊ 94 | HPUB SUBJECT REPLY 23 23␍␊NATS/1.0␍␊Header: X␍␊␍␊␍␊ 95 | HPUB SUBJECT REPLY 48 55␍␊NATS/1.0␍␊Header1: X␍␊Header1: Y␍␊Header2: Z␍␊␍␊PAYLOAD␍␊ 96 | HPUB SUBJECT REPLY 48 48␍␊NATS/1.0␍␊Header1: X␍␊Header1: Y␍␊Header2: Z␍␊␍␊␍␊ 97 | 98 | HPUB <SUBJECT> [REPLY] <HDR_LEN> <TOT_LEN>␍␊ 99 | <HEADER><PAYLOAD>␍␊
100 | ``` 101 | 102 | #### NOTES: 103 | 104 | - `HDR_LEN` includes the entire serialized header, from the start of the version 105 | string (`NATS/1.0`) up to and including the ␍␊ before the payload 106 | - `TOT_LEN` is the payload length plus the `HDR_LEN` 107 | 108 | ### MSG with Headers 109 | 110 | Clients will see `HMSG` protocol lines for `MSG`s that contain headers: 111 | 112 | ``` 113 | HMSG SUBJECT 1 REPLY 23 30␍␊NATS/1.0␍␊Header: X␍␊␍␊PAYLOAD␍␊ 114 | HMSG SUBJECT 1 REPLY 23 23␍␊NATS/1.0␍␊Header: X␍␊␍␊␍␊ 115 | HMSG SUBJECT 1 REPLY 48 55␍␊NATS/1.0␍␊Header1: X␍␊Header1: Y␍␊Header2: Z␍␊␍␊PAYLOAD␍␊ 116 | HMSG SUBJECT 1 REPLY 48 48␍␊NATS/1.0␍␊Header1: X␍␊Header1: Y␍␊Header2: Z␍␊␍␊␍␊ 117 | 118 | HMSG <SUBJECT> <SID> [REPLY] <HDR_LEN> <TOT_LEN>␍␊ 119 | <HEADER><PAYLOAD>␍␊ 120 | ``` 121 | 122 | - `HDR_LEN` includes the entire serialized header, from the start of the version 123 | string (`NATS/1.0`) up to and including the ␍␊ before the payload 124 | - `TOT_LEN` is the payload length plus the `HDR_LEN` 125 | 126 | ## Decision 127 | 128 | Implemented and merged to master. 129 | 130 | ## Consequences 131 | 132 | Use of headers is possible. 133 | 134 | ## Compatibility Across NATS Clients 135 | 136 | The following is a list of features to ensure compatibility across NATS clients 137 | that support headers. Because the Go client and nats-server leverage 138 | the Go implementation as described above, the API used will determine how header 139 | names are serialized. 140 | 141 | ### Case-sensitive Operations 142 | 143 | In order to promote compatibility across clients, this section describes how 144 | clients should behave. All operations are _case-sensitive_. Application implementations 145 | should provide option(s) to enable clients to work in a case-insensitive manner or to 146 | format header names canonically. 147 | 148 | #### Reading Values 149 | 150 | `GET` and `VALUES` are case-sensitive operations.
151 | 152 | - `GET` returns a `string` of the first value found matching the specified key 153 | in a case-sensitive lookup or an empty string. 154 | - `VALUES` returns a list of all values that case-sensitive match the specified 155 | key or an empty/nil/null list. 156 | 157 | #### Setting Values 158 | 159 | - `APPEND` is a case-sensitive and case-preserving operation. The header is set 160 | exactly as specified by the user. 161 | - `SET` and `DELETE` are case-sensitive: 162 | - `DELETE` removes headers in a case-sensitive operation. 163 | - `SET` can be considered the result of a `DELETE` followed by an `APPEND`. 164 | This means only exact-match keys are deleted, and the specified value is 165 | added under the specified key. 166 | 167 | #### Case-insensitive Option 168 | 169 | The operations `GET`, `VALUES`, `SET`, `DELETE` and `APPEND`, in the presence of a 170 | case-insensitive match requirement, will operate on equivalent matches. 171 | 172 | This functionality is constrained as follows: 173 | 174 | - `GET` returns the first matching header value in a case-insensitive match. 175 | - `VALUES` returns the union of all headers that case-insensitive match. If the 176 | exact key is not found, an empty/nil/null list is returned. 177 | - `DELETE` removes all headers that case-insensitive match the specified 178 | key. 179 | - `SET` is the combination of a case-insensitive `DELETE` followed by an 180 | `APPEND`. 181 | - `APPEND` will use the first matching key found and add values. If no key is 182 | found, values are added to a key preserving the specified case. 183 | 184 | > Note that case-insensitive operations are only suggested, and not required to be 185 | implemented by clients, especially if the implementation allows the user code to 186 | easily iterate over keys and values. 187 | 188 | ### Multiple Header Values Serialization 189 | 190 | When serializing, entries that have more than one value should be serialized one 191 | per line.
While the HTTP header standard prefers values to be a comma-separated 192 | list, this introduces additional parsing requirements and ambiguity for client 193 | code. HTTP itself doesn't apply this requirement to headers such as 194 | `Set-Cookie`. Libraries, such as Go's, do not interpret comma-separated values as 195 | lists. 196 | -------------------------------------------------------------------------------- /adr/ADR-2.md: -------------------------------------------------------------------------------- 1 | # NATS Typed Messages 2 | 3 | |Metadata|Value| 4 | |--------|-----| 5 | |Date |2020-05-06| 6 | |Author |@ripienaar| 7 | |Status |Implemented| 8 | |Tags |jetstream, server, client| 9 | 10 | ## Context 11 | 12 | NATS Server has a number of JSON based messages - monitoring, JetStream API and more. These are consumed, 13 | and in the case of the API also produced, by 3rd party systems in many languages. To assist with standardization 14 | of data validation, variable names and more, we want to create JSON Schema documents for all our outward facing 15 | JSON based communication. Specifically, this is not for server-to-server communication protocols. 16 | 17 | This effort is ultimately not for our own use - though libraries like `jsm.go` will use these to do validation 18 | of inputs - it is about easing interoperability with other systems and eventually creating a Schema Registry. 19 | 20 | There are a number of emerging formats for describing message content: 21 | 22 | * JSON Schema - transport agnostic way of describing the shape of JSON documents 23 | * AsyncAPI - middleware specific API description that uses JSON Schema for payload descriptions 24 | * CloudEvents - standard for wrapping system specific events in a generic, routable package. Supported by all 25 | major Public Clouds and many event gateways. Can reference JSON Schema.
26 | * Swagger / OpenAPI - standard for describing web services that uses JSON Schema for payload descriptions 27 | 28 | In all of these, many of the actual details, like how to label types of event or how to version them, are left up 29 | to individual projects to solve. This ADR describes how we are approaching this. 30 | 31 | ## Decision 32 | 33 | ### Overview 34 | 35 | We will start by documenting our data types using JSON Schema Draft 7. AsyncAPI and Swagger can both reference 36 | these documents using remote references, so this, as a starting point, gives us the most flexibility and interoperability 37 | to later create API and Transport specific schemas that reference these. 38 | 39 | We define two major types of typed message: 40 | 41 | * `Message` - any message with a compatible `type` hint embedded in it 42 | * `Event` - a specialized `message` that has timestamps and event IDs, suitable for transformation to 43 | Cloud Events. Typically, published unsolicited. 44 | 45 | Today the NATS Server does not support publishing Cloud Events natively; however, a bridge can be created to publish 46 | those to other cloud systems using the `jsm.go` package, which supports converting `events` into the Cloud Events format. 47 | 48 | ### Message Types 49 | 50 | There is no standard way to indicate the schema of a specific message. We looked at a lot of prior art from CNCF 51 | projects, public clouds and more but found very little commonality. The nearest standard is the Uniform Resource Name, 52 | which still leaves most of the details up to the project and does not conventionally support versioning. 53 | 54 | We chose a message type like `io.nats.jetstream.api.v1.consumer_delete_response`, `io.nats.server.advisory.v1.client_connect` 55 | or `io.nats.unknown_message`. 56 | 57 | `io.nats.unknown_message` is a special type returned for anything without valid type hints. In Go that implies 58 | `map[string]interface{}`.
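A sketch of how a consumer might recover the type hint, falling back to `io.nats.unknown_message` when no valid hint is present (the helper name is illustrative, not part of any NATS library):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// schemaType extracts the "type" hint from a typed message, returning
// io.nats.unknown_message for anything that is not JSON or lacks a hint.
func schemaType(msg []byte) string {
	var hint struct {
		Type string `json:"type"`
	}
	if err := json.Unmarshal(msg, &hint); err != nil || hint.Type == "" {
		return "io.nats.unknown_message"
	}
	return hint.Type
}

func main() {
	fmt.Println(schemaType([]byte(`{"type":"io.nats.jetstream.api.v1.stream_configuration"}`)))
	fmt.Println(schemaType([]byte(`{"foo":1}`))) // io.nats.unknown_message
}
```

A dispatcher could use the returned string to select the matching JSON Schema before validating the rest of the payload.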
59 | 60 | The structure is as follows: `io.nats.<source>.<category>.v<version>.<message name>` 61 | 62 | #### Source 63 | 64 | The project is the overall originator of a message and should be short but descriptive. Today we have 2 - `server` and `jetstream` - 65 | and as we continue to build systems around Stream Processing and more we'd add more of these types. I anticipate, 66 | for example, adding a few to Surveyor for publishing significant lifecycle events. 67 | 68 | Generated Cloud Events messages have the `source` set to `urn:nats:<source>`. 69 | 70 | |Project|Description| 71 | |-------|-----------| 72 | |`server`|The core NATS Server excluding JetStream related messages| 73 | |`jetstream`|Any JetStream related message| 74 | 75 | #### Category 76 | 77 | The `category` groups messages by related sub-groups of the `source`; often this also appears in the subjects 78 | these messages get published to. 79 | 80 | This is a bit undefined; examples in use now are `api`, `advisory`, `metric`. Where possible try to fit in with 81 | existing chosen ones; if none suits, update this table with your choice and try to pick generic category names. 82 | 83 | |Category|Description| 84 | |----|-----------| 85 | |`api`|Typically these are `messages` used in synchronous request response APIs| 86 | |`advisory`|These are `events` that describe a significant event that happened like a client connecting or disconnecting| 87 | |`metric`|These are `events` that relate to monitoring - how long did it take a message to be acknowledged| 88 | 89 | #### Versioning 90 | 91 | The ideal outcome is that we never need to version any message and maintain future compatibility. 92 | 93 | We think we can do that with the JetStream API. Monitoring, Observability and black box management is emerging, and we 94 | know less about how that will look in the long run, so we think we will need to version those.
95 | 96 | The philosophy has to be that we only add fields and do not significantly change the meaning of existing ones; this 97 | means the messages stay `v1`, but major changes will require bumps. So all message types include a single-digit version. 98 | 99 | #### Message Name 100 | 101 | Just a string identifying what this message is about - `client_connect`, `client_disconnect`, `api_audit` etc. 102 | 103 | ## Examples 104 | 105 | ### Messages 106 | 107 | At minimum a typed message must include a `type` string: 108 | 109 | ```json 110 | { 111 | "type": "io.nats.jetstream.api.v1.stream_configuration" 112 | } 113 | ``` 114 | 115 | The rest of the document is up to the specific use case. 116 | 117 | ### Advisories 118 | 119 | Advisories must include additional fields: 120 | 121 | ```json 122 | { 123 | "type": "io.nats.jetstream.advisory.v1.api_audit", 124 | "id": "uafvZ1UEDIW5FZV6kvLgWA", 125 | "timestamp": "2020-04-23T16:51:18.516363Z" 126 | } 127 | ``` 128 | 129 | * `timestamp` - RFC 3339 format in UTC timezone, with sub-second precision added if present 130 | * `id` - Any sufficiently unique ID such as those produced by `nuid` 131 | 132 | ### Errors 133 | 134 | Any `message` can have an optional `error` property if needed, which can be specified in the JSON Schema; 135 | it is not a key part of the type hint system which this ADR focuses on. 136 | 137 | In JetStream [ADR 0001](ADR-1.md) we define an error message as this: 138 | 139 | ```json 140 | { 141 | "error": { 142 | "description": "Server Error", 143 | "code": 500 144 | } 145 | } 146 | ``` 147 | 148 | Where error codes follow basic HTTP standards. This `error` object is not included on success and so 149 | acceptable error codes are between `300` and `599`. 150 | 151 | It'll be advantageous to standardise around this structure; today only the JetStream API has this and we have
153 | 154 | ## Schema Storage 155 | 156 | Schemas will eventually be kept in some form of formal Schema registry. In the near future they will all be placed as 157 | fully dereferenced JSON files at `http://nats.io/schemas`. 158 | 159 | The temporary source for these can be found in the `nats-io/jetstream` repository including tools to dereference the 160 | source files. 161 | 162 | ## Usage 163 | 164 | Internally the `jsm.go` package use these Schemas to validate all requests to the JetStream API. This is not required as 165 | the server does its own validation too - but it's nice to fail fast and give extended errors like a JSON validator will 166 | give. 167 | 168 | Once we add JetStream API support to other languages it would be good if those languages use the same Schemas for 169 | validation to create a unified validation strategy. 170 | 171 | Eventually these Schemas could be used to generate the API structure. 172 | 173 | The `nats` utility has a `nats events` command that can display any `event`. It will display any it finds, special 174 | formatting can be added using Golang templates in its source. Consider adding support to it whenever a new `event` is added. 175 | 176 | ## Status 177 | 178 | While this is marked `accepted`, we're still learning and exploring their usage so changes should be anticipated. 179 | 180 | ## Consequences 181 | 182 | Many more aspects of the Server move into the realm of being controlled and versioned where previously we took a much 183 | more relaxed approach to modifications to the data produced by `/varz` and more. 
184 | -------------------------------------------------------------------------------- /adr/ADR-21.md: -------------------------------------------------------------------------------- 1 | # NATS Configuration Contexts 2 | 3 | | Metadata | Value | 4 | |----------|-----------------------| 5 | | Date | 2021-12-14 | 6 | | Author | @ripienaar | 7 | | Status | Partially Implemented | 8 | | Tags | client | 9 | 10 | ## Background 11 | 12 | A `nats context` is a named configuration stored in a configuration file, allowing a set of related configuration items to be stored and accessed later. 13 | 14 | In the `nats` CLI this is used extensively; for example, `nats stream ls --context orders` would load the `orders` context and configure items such as login credentials, servers, domains, API prefixes and more. 15 | 16 | The intention of this ADR is to document the storage of these contexts so that clients can, optionally, support using them. 17 | 18 | ## Version History 19 | 20 | | Date | Revision | 21 | |------------|--------------------------------------------| 22 | | 2020-08-12 | Initial basic design | 23 | | 2020-05-07 | JetStream Domains | 24 | | 2021-12-13 | Custom Inbox Prefix | 25 | | 2024-12-03 | Windows Cert Store, User JWT and TLS First | 26 | 27 | This reflects a current implementation in wide use via the CLI; as such, it is a stable release. Only non-breaking additions will be considered. 28 | 29 | ## Design 30 | 31 | Today the design is entirely file based for maximum portability; later we can consider other options like S3 buckets, KV stores, etc. 32 | 33 | ### Configuration Paths 34 | 35 | There is generally no standard for what goes on in a user's home directory on a Unix system.
Recently the Free Desktop team has been working on the [XDG Base Directory Specification](https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html), which specifies in detail where configuration, data, binaries and more are to be stored in a way that's compatible with Linux desktops like KDE and Gnome, but also with systems such as systemd. 36 | 37 | We therefore based the design on this specification as a widely supported standard. 38 | 39 | |File|Description| 40 | |----|-----------| 41 | |`~/.config`|The default location for user configuration, configurable using `XDG_CONFIG_HOME`| 42 | |`~/.config/nats`|Where all NATS related user configuration should go| 43 | |`~/.config/nats/context.txt`|The currently selected (default) context; would contain just `ngs`| 44 | |`~/.config/nats/context/ngs.json`|The configuration for the `ngs` context| 45 | 46 | While this is Linux centered it does work on Windows; we might want to consider a more typical path to replace `~/.config` there and keep the rest as above.
47 | 48 | ### Context content 49 | 50 | The `~/.config/nats/context/ngs.json` file has the following JSON fields: 51 | 52 | | Key | Default | Description | 53 | |--------------------------|-------------------------|----------------------------------------------------------------------------------------------------------| 54 | | `description` | | A human friendly description for the specific context | 55 | | `url` | `nats://localhost:4222` | Comma-separated list of server URLs | 56 | | `token` | | Authentication token | 57 | | `user` | | The username to connect with, requires a password | 58 | | `password` | | Password to connect with | 59 | | `creds` | | Path to a NATS Credentials file | 60 | | `nkey` | | Path to a NATS Nkey file | 61 | | `cert` | | Path to the x509 public certificate | 62 | | `key` | | Path to the x509 private key | 63 | | `ca` | | Path to the x509 Certificate Authority | 64 | | `nsc` | | A `nsc` resolve url for loading credentials and server urls | 65 | | `jetstream_domain` | | The JetStream Domain to use | 66 | | `jetstream_api_prefix` | | The JetStream API Prefix to use | 67 | | `jetstream_event_prefix` | | The JetStream Event Prefix | 68 | | `inbox_prefix` | | A prefix to use when generating inboxes | 69 | | `user_jwt` | | The user JWT token | 70 | | `tls_first` | | Enables the use of TLS on Connect rather than the historical INFO-first approach | 71 | | `windows_cert_store` | | The Windows cert store to use for access to the TLS files, `windowscurrentuser` or `windowslocalmachine` | 72 | | `windows_cert_match_by` | | How certificates are searched for in the store, `subject` or `issuer` | 73 | | `windows_cert_match` | | Which certificate to use inside the store | 74 | | `windows_ca_certs_match` | | Which Certificate Authority to use inside the store | 75 | 76 | All fields are optional. None are marked as `omitempty`, so users wishing to edit these files with an editor can see all valid key names.
77 | 78 | The above settings map quite obviously to client features, with the exception of `nsc`. The `nsc` key takes a URL-like value; examples are: 79 | 80 | * `nsc://operator` 81 | * `nsc://operator/account` 82 | * `nsc://operator/account/user` 83 | * `nsc://operator/account/user?operatorSeed&accountSeed&userSeed` 84 | * `nsc://operator/account/user?operatorKey&accountKey&userKey` 85 | * `nsc://operator?key&seed` 86 | * `nsc://operator/account?key&seed` 87 | * `nsc://operator/account/user?key&seed` 88 | * `nsc://operator/account/user?store=/a/.nsc/nats&keystore=/foo/.nkeys` 89 | 90 | The context invokes `nsc generate profile `; the response will be a non-zero exit code on error, else a structure like: 91 | 92 | ```json 93 | { 94 |     "user_creds": "", 95 |     "operator" : { 96 |        "service": "hostport" 97 |     } 98 | } 99 | ``` 100 | 101 | This will configure the `creds` and `url` parts of the context. If either `url` or `creds` is specifically configured in the Context those will override the answer from `nsc`. 102 | 103 | See `nsc generate profile --help` for details. 104 | 105 | ## Sample Usage APIs 106 | 107 | I don't think we really want to dictate standard APIs here; in Go we have 2 main ways to use the package. 108 | 109 | It's also fine to delegate management of these to the `nats` CLI. 110 | 111 | A very basic and quick way to just connect to a specific context: 112 | 113 | ```go 114 | nc, _ := Connect(os.Getenv("CONTEXT"), nats.MaxReconnects(-1)) 115 | ``` 116 | 117 | Here `Connect` will construct a `[]nats.Option` based on the context settings and append the user-supplied ones to the list. 118 | 119 | A way to load a context and optionally override some options: 120 | 121 | ```go 122 | nctx, _ := Load(os.Getenv("CONTEXT"), WithServerURL("nats://other:4222")) 123 | nc, _ := nctx.Connect(nats.MaxReconnects(-1)) 124 | ``` 125 | 126 | When the context name is empty it will use the currently selected context; if no context is selected and the name is empty, it's an error.
127 | -------------------------------------------------------------------------------- /adr/ADR-30.md: -------------------------------------------------------------------------------- 1 | # Subject Transform 2 | 3 | | Metadata | Value | 4 | |----------|-----------------------------------| 5 | | Date | 2022-07-17 | 6 | | Author | @jnmoyne, @derekcollison, @tbeets | 7 | | Status | Implemented | 8 | | Tags | server | 9 | 10 | ## Context and Problem Statement 11 | 12 | As part of multiple technical implementations of the NATS Server, there is a need to create a mapping formula, or 13 | _transform_, that 14 | can be applied to an input subject, yielding a desired output subject. 15 | 16 | Transforms can be used as part of: 17 | 18 | * Core NATS Subject mapping at the account level 19 | * JetStream stream definition 20 | * JetStream sourcing 21 | * JetStream RePublish 22 | * Cross-account service and stream mapping 23 | * Shadow subscriptions (e.g. across a leaf) 24 | 25 | A subject transform shall be defined by a _Source_ filter that defines what input subjects are eligible (via match) to 26 | be transformed, and a _Destination_ subject mapping format that defines the notional subject filter that the _transformed_ 27 | output subject will match. 28 | 29 | 30 | Input subject tokens that match Source wildcard(s) "survive" the transformation and can be used as input to 'mapping 31 | functions' whose output is used to build the output subject. 32 | 33 | ## Design 34 | 35 | The Destination, taken together with the Source, forms a valid subject token transform. The resulting transform 36 | is applied to an input subject (that matches the Source subject filter) to determine the output subject. 37 | 38 | ### Weighted and cluster-scoped mappings 39 | 40 | In the case of Core NATS subject mapping at the account level, you can actually have more than one mapping destination per source. 41 | Each of those mappings has a 'weight' (a percentage between 0 and 100%), for a total of 100%.
That weight percentage indicates the likelihood of that mapping being used. 42 | 43 | Furthermore, (as of 2.10) weighted mappings can be cluster-scoped, meaning that you can also create mapping destinations (for a total of 100% per cluster name) that apply (and take precedence) when the server is part of the cluster specified. This allows administrators to define mappings that change depending upon the cluster where the message is initially published from. 44 | 45 | For example, consider the following mappings: 46 | 47 | ``` 48 | "foo":[ 49 | {destination:"foo.west", weight: 100%, cluster: "west"}, 50 | {destination:"foo.central", weight: 100%, cluster: "central"}, 51 | {destination:"foo.east", weight: 100%, cluster: "east"}, 52 | {destination:"foo.elsewhere", weight: 100%} 53 | ], 54 | ``` 55 | 56 | These mean that an application publishing a message on subject `foo` will result in the message being published over Core NATS with the subject: 57 | - `foo.west` if the application is connected to any server in the `west` cluster 58 | - `foo.central` if the application is connected to any server in the `central` cluster 59 | - `foo.east` if the application is connected to any server in the `east` cluster 60 | 61 | You can also define 100%'s worth of destinations as a catch-all for servers in the other clusters of the Super-Cluster (or if the server is not running in clustered mode). In the example above, a message published from cluster `south` would be mapped to `foo.elsewhere`. 62 | 63 | ### Transform rules 64 | 65 | A given input subject must match the Source filter (in the usual subscription-interest way) for the transform to be 66 | valid. 67 | 68 | For a valid input subject: 69 | 70 | * Source-matching literal-token positions are ignored, i.e. they are not usable by mapping functions. 71 | * Source-matching wildcard-token positions (if any) are used by mapping functions in the transform Destination format.
72 | * Source-matching `*` wildcard (single token) tokens are passed to Destination mapping functions by wildcard cardinal 73 | position number. E.g. using `$x` or `{{wildcard(x)}}` notation, where `x` is the cardinal position of the `*` 74 | wildcard-token in the Source filter (1 for the first `*` wildcard-token, 2 for the second, and so on). 75 | * Source-matching `>` wildcard (multi token) tokens are mapped to the respective `>` position in the Destination format 76 | * Literal tokens in the Destination format are mapped to the output subject unchanged (position and value) 77 | 78 | ### Using all the wildcard-tokens in the transform's Source 79 | 80 | * For transforms that are defined in inter-account imports (streams and services) the destinations _MUST_ make use of _ALL_ of the wildcard-tokens present in the transform's Source. 81 | * However, starting with version 2.10, for transforms used in any other place (i.e. Core NATS account mappings, subject transforms in streams, stream imports and stream republishing) it is allowed to drop any number of wildcard-tokens. 82 | 83 | ## Mapping Functions 84 | 85 | Mapping functions are placed in the transform Destination format as subject tokens. Their format 86 | is `{{MappingFunction()}}` and the legacy `$x` notation (equivalent to `{{Wildcard(x)}}`) is still supported for 87 | backwards compatibility. 88 | 89 | > Note: Mapping function names are valid in both upper CamelCase and all lower case (i.e.
both `{{Wildcard(1)}}` 90 | > and `{{wildcard(1)}}` are valid) 91 | 92 | > For transforms defined in inter-account imports (streams and services) the _ONLY_ allowed mapping function is `{{Wildcard(x)}}` (or the legacy `$x`) 93 | 94 | ### List of Mapping Functions 95 | 96 | Currently (2.9.0) the following mapping functions are available: 97 | 98 | * `{{Wildcard(x)}}` outputs the value of the token for wildcard-token index `x` (equivalent to the legacy `$x` notation) 99 | * `{{Partition(x,a,b,c,...)}}` outputs a partition number between `0` and `x-1` assigned from a deterministic hashing of 100 | the value of wildcard-tokens `a`, `b` and `c` 101 | * `{{Split(x,y)}}` splits the value of wildcard-token `x` into multiple tokens on the presence of character `y` 102 | * `{{SplitFromLeft(x,y)}}` splits in two the value of wildcard-token `x` at `y` characters starting from the left 103 | * `{{SplitFromRight(x,y)}}` splits in two the value of wildcard-token `x` at `y` characters starting from the right 104 | * `{{SliceFromLeft(x,y)}}` slices into multiple tokens the value of wildcard-token `x`, every `y` characters starting 105 | from the left 106 | * `{{SliceFromRight(x,y)}}` slices into multiple tokens the value of wildcard-token `x`, every `y` characters starting 107 | from the right 108 | 109 | ### Example transforms 110 | 111 | | Input Subject | Source filter | Destination format | Output Subject | 112 | |:--------------------------|------------------------|-----------------------------------------|-------------------------------------| 113 | | `one.two.three` | `">"` | `"uno.>"` | `uno.one.two.three` | 114 | | `four.five.six` | `">"` | `"eins.>"` | `eins.four.five.six` | 115 | | `one.two.three` | `">"` | `">"` | `one.two.three` | 116 | | `four.five.six` | `">"` | `"eins.zwei.drei.vier.>"` | `eins.zwei.drei.vier.four.five.six` | 117 | | `one.two.three` | `"one.>"` | `"uno.>"` | `uno.two.three` | 118 | | `one.two.three` | `"one.two.>"` | `"uno.dos.>"` | `uno.dos.three`
| 119 | | `one` | `"one"` | `"uno"` | `uno` | 120 | | `one.two.three.four.five` | `"one.*.three.*.five"` | `"uno.$2.$1"` | `uno.four.two` | 121 | | `one.two.three.four.five` | `"one.*.three.*.five"` | `"uno.{{wildcard(2)}}.{{wildcard(1)}}"` | `uno.four.two` | 122 | | `one.two.three.four.five` | `"*.two.three.>"` | `"uno.$1.>"` | `uno.one.four.five` | 123 | | `-abc-def--ghij-` | `"*"` | `"{{split(1,-)}}"` | `abc.def.ghij` | 124 | | `12345` | `"*"` | `"{{splitfromleft(1,3)}}"` | `123.45` | 125 | | `12345` | `"*"` | `"{{SplitFromRight(1,3)}}"` | `12.345` | 126 | | `1234567890` | `"*"` | `"{{SliceFromLeft(1,3)}}"` | `123.456.789.0` | 127 | | `1234567890` | `"*"` | `"{{SliceFromRight(1,3)}}"` | `1.234.567.890` | 128 | 129 | > Note: The NATS CLI provides a utility, `server mappings`, for experimenting with different transforms and input 130 | > subjects. 131 | -------------------------------------------------------------------------------- /adr/ADR-39.md: -------------------------------------------------------------------------------- 1 | # Certificate Store 2 | 3 | | Metadata | Value | 4 | |----------|------------------| 5 | | Date | 2023-06-22 | 6 | | Author | @tbeets | 7 | | Status | Implemented | 8 | | Tags | server, security | 9 | 10 | ## Problem Statement 11 | Some users need to source the NATS Server's TLS identity from a _credential store_ rather than a file. This may 12 | be for either client (mTLS) or server identity use cases. The need is driven by an in-situ credential store and/or an 13 | organizational policy that disallows deploying secrets as operating system files. 14 | 15 | Edge computing scenarios involving large "fleets" of managed servers may especially require credential store 16 | integration.
17 | 18 | ## Context 19 | A credential store may be offered as a supported operating system service such as the 20 | [Microsoft Software Key Storage Provider](https://learn.microsoft.com/en-us/windows/win32/seccertenroll/cng-key-storage-providers), or by 21 | a 3rd-party trusted platform module (TPM) provider. Some credential store providers may implement a standards-based interface such as 22 | [PKCS #11 Cryptographic Token Interface](http://docs.oasis-open.org/pkcs11/pkcs11-base/v2.40/os/pkcs11-base-v2.40-os.html). 23 | 24 | NATS Server requires a configuration options interface for operators to specify a credential store provider, in a TLS 25 | configuration block (or TLS Map), with provider-specific identity parameters. 26 | 27 | ## Design 28 | The following configuration properties are added to the [TLS map](https://docs.nats.io/running-a-nats-service/configuration/securing_nats/tls): 29 | 30 | ### Properties in TLS Map 31 | 32 | | Property | Description | Default | Example Value | 33 | |---------------|---------------------------------------------------------------|-----------|----------------------| 34 | | cert_store | a supported credential store provider (see Enabled Providers) | | "WindowsCurrentUser" | 35 | | cert_match_by | provider-specific identity search/lookup option | "Subject" | "Subject" | 36 | | cert_match | identity search/lookup term | | "example.com" | 37 | 38 | If the `cert_store` configuration properties are used in a given TLS map, they logically take the place of the `cert_file` and 39 | `key_file` properties. 40 | 41 | If the operator specifies both `cert_store` and `cert_file` properties in the same TLS map, the server will error on 42 | startup with message `'cert_file' and 'cert_store' may not both be configured`. 43 | 44 | For a given TLS map, if `cert_store` is configured, the `key_file` property, if present, will be ignored.
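A hypothetical TLS map using these properties might look like the following (the values shown are illustrative, reusing the example values from the table above):

```text
tls {
    cert_store:    "WindowsCurrentUser"
    cert_match_by: "Subject"
    cert_match:    "example.com"
    # note: no cert_file / key_file - cert_store takes their place
}
```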
45 | 46 | > Note: provider name is case-insensitive 47 | 48 | If a `cert_store` provider unknown to the NATS Server (and the specific operating system build) is configured, the server will error at startup with message 49 | `cert store type not implemented`. 50 | 51 | ### Enabled Cert Store Providers 52 | 53 | | Provider Name | Description | Operating System | 54 | |-----------------------|---------------------------------------------------------------------------------------|------------------| 55 | | `WindowsCurrentUser` | Microsoft Software Key Storage Provider, local "MY" (Personal) store of Current User | Windows | 56 | | `WindowsLocalMachine` | Microsoft Software Key Storage Provider, local "MY" (Personal) store of Local Machine | Windows | 57 | 58 | ### Windows Providers 59 | 60 | The Microsoft Software Key Storage Provider (KSP) is accessed by NATS Server at startup and on-demand when TLS-negotiation signatures are required. 61 | 62 | The two providers differ only by the operating system access and permissions scope required and the 63 | specific "store" of credentials (certificates and related private key) that are sourced from the provider: 64 | 65 | - `WindowsCurrentUser` - The "MY" store associated with the _operating system user_ of the NATS Server process. 66 | - `WindowsLocalMachine` - The "MY" store associated with the _local machine_. The NATS Server process user must have the necessary Windows entitlement to access the local machine's certificate store. 67 | 68 | > Note: The "MY" store on Windows is what appears as Personal->Certificates in the Microsoft Management Console (Certificates snap-in). 69 | 70 | The Windows-build of NATS Server has been enhanced to directly leverage the Windows Security & Identity library functions. 71 | APIs from libraries `ncrypt.dll` and `crypt32.dll` are invoked to find and retrieve public certificates at startup and perform signatures during TLS negotiation. 
72 | 73 | #### Inclusion of Intermediate CA Certificates 74 | 75 | When a leaf certificate is matched (see the Example Configurations below), the NATS server will attempt to source a valid 76 | trust chain of certificates from the local Windows machine's trust store, i.e. a valid chain from the leaf to a trusted 77 | self-signed certificate in the store (typically a CA root). 78 | 79 | If at least one valid chain is found, the first valid chain is selected and the NATS server will form a final certificate as 80 | the matched leaf certificate plus the non-self-signed intermediate certificates that may be present in the valid chain. 81 | 82 | If no valid trust chain is found in the local Windows machine's trust store, the NATS server will form the final certificate as the matched leaf 83 | certificate only; no intermediate chained certs will be included. 84 | 85 | #### Validation Policy 86 | 87 | Note that CRL, OCSP, explicit role validation (TLS server or TLS client) and other policy features are specifically avoided 88 | in certificate match (and intermediate population) against the Windows KSP, as these are ultimately provided by the eventual 89 | trust validator in TLS negotiation, i.e. this provider implements identity lookup and identity signature but is not itself 90 | the trust/policy validator of its own identity claims. 91 | 92 | #### Identity Lookup Options 93 | 94 | `cert_match_by` may be one of the following: 95 | 96 | - `Subject` - the KSP will compare the `cert_match` property value to each of the certificate's Subject RDN values and return the first match. See also: [CERT_FIND_SUBJECT_STR](https://learn.microsoft.com/en-us/windows/win32/api/wincrypt/nf-wincrypt-certfindcertificateinstore) 97 | - `Issuer` - the KSP will compare the `cert_match` property value to each of the certificate's Issuer RDN values and return the first match.
See also: [CERT_FIND_ISSUER_STR](https://learn.microsoft.com/en-us/windows/win32/api/wincrypt/nf-wincrypt-certfindcertificateinstore) 98 | 99 | If the configured `cert_match_by` does not match an available provider option, the server will error with message `cert match by type not implemented`. 100 | 101 | #### Example Configurations 102 | 103 | Given a certificate provisioned to MY store with the following Issuer and Subject distinguished names: 104 | 105 | ```text 106 | Certificate: 107 | Data: 108 | Version: 3 (0x2) 109 | Serial Number: 110 | 7c:b1:37:8c:1a:70:1a:99:4e:50:37:29:6f:12:2c:bd:12:27:0c:64 111 | Signature Algorithm: sha256WithRSAEncryption 112 | Issuer: O = Synadia Communications Inc., OU = NATS.io, CN = localhost 113 | Validity 114 | Not Before: Feb 4 19:51:00 2019 GMT 115 | Not After : Feb 3 19:51:00 2024 GMT 116 | Subject: OU = NATS.io, CN = example.com 117 | Subject Public Key Info: 118 | ... 119 | ``` 120 | 121 | and TLS Map in a server configuration in format: 122 | 123 | ```text 124 | tls { 125 | cert_store: 126 | cert_match_by: 127 | cert_match: 128 | ... 129 | } 130 | ``` 131 | 132 | | cert_match_by | cert_match | Result | 133 | |---------------|-----------------------------------------------------------------|------------------| 134 | | "Subject" | "example.com" | Success (found) | 135 | | "Subject" | "NATS.io" | Success (found) | 136 | | "Subject" | "OU = NATS.io, CN = example.com" | Fail (not found) | 137 | | "Subject" | "CN = example.com" | Fail (not found) | 138 | | "Issuer" | "localhost" | Success (found) | 139 | | "Issuer" | "Synadia Communications Inc." 
| Success (found) | 140 | | "Issuer" | "O = Synadia Communications Inc., OU = NATS.io, CN = localhost" | Fail (not found) | 141 | | "Issuer" | "CN = localhost" | Fail (not found) | 142 | 143 | > Note: To avoid TLS negotiation failure caused by return of the wrong certificate, it's recommended to look up by the 144 | > Subject value representing the unique name of your NATS Server's certificate identity, e.g. the CN value as in the 145 | > "example.com" case above. 146 | 147 | ## Futures 148 | 149 | The certificate store interface and new TLS Map configuration entries are intended to be extensible to future 150 | provider interfaces that NATS Server may implement. 151 | -------------------------------------------------------------------------------- /adr/ADR-36.md: -------------------------------------------------------------------------------- 1 | # Subject Mapping Transforms in Streams 2 | 3 | |Metadata| Value | 4 | |--------|---------------| 5 | |Date | 2023-02-10 | 6 | |Author | @jnmoyne | 7 | |Status | Implemented | 8 | |Tags | jetstream, client, server | 9 | 10 | ## Context and Problem Statement 11 | 12 | Subject mapping and transformation is only available at the Core NATS level, meaning that in order to define or modify mappings one has to either have access to the server config file, or have access to the account's key in operator security mode. While Core NATS subject mapping has its place and use (e.g. scaling a single stream for writes using partitioning, traffic routing, A/B and Canary testing), many (most) use cases for subject mapping happen in the context of streams, and having to go to the Core NATS server/account level to define subject mappings is quite limiting, as it is not easy for an application programmer to define the mappings they need (even if they have access to the account's key).
13 | 14 | On the other hand, allowing the application of subject mapping transforms at the stream level makes it very easy for application developers or NATS administrators to define and manage those mappings. There is more than one place in a stream's message flow where subject mapping transforms can be applied, which enables some very interesting new functionalities (e.g. KV bucket sourcing). 15 | 16 | ## Prior Work 17 | 18 | See [ADR-30](ADR-30.md) for Core NATS subject mapping and a description of the available subject transform functions. 19 | 20 | ## Features introduced 21 | 22 | The new features introduced by version 2.10 of the NATS server allow the application of subject mapping transformations in multiple places in the stream configuration: 23 | 24 | - You can apply a subject mapping transformation as part of a Stream mirror. 25 | - You can apply a subject mapping transformation as part of a Stream source. 26 | - Amongst other use cases, this enables the ability to do sourcing between KV buckets (as the name of the bucket is part of the subject name in the KV bucket streams, and therefore has to be transformed during the sourcing, as the name of the sourcing bucket is different from the name(s) of the bucket(s) being sourced). 27 | - You can apply a subject mapping transformation at the ingress (input) of the stream, meaning after it's been received on Core NATS, or mirrored or sourced from another stream, and before limits are applied (and it gets persisted). This subject mapping transformation is only that: it does not filter messages, it only transforms the subjects of the messages matching the subject mapping source. 28 | - This enables the ability to insert a partition number as a token in the message subjects. 29 | - You can also apply a subject mapping transformation as part of the re-publishing of messages.
30 | 31 | Subject mapping transformation can be seen as an extension of subject filtering: there cannot be any subject mapping transformation without an associated subject filter. 32 | 33 | A subject filtering and mapping transform is composed of two parts: a subject filter (the 'source' part of the transform) and the destination transform (the 'destination' part of the transform). An empty (i.e. `""`) destination transform means _NO transformation_ of the subject. 34 | 35 | ![](images/stream-transform.png) 36 | 37 | Just like streams and consumers can now have more than one single subject filter, mirrors and sources can have more than one set of subject filter and transform destination. 38 | 39 | Just like with consumers, you can either specify a single subject filter and optional subject transform destination, or an array of subject transform configs composed of a source filter and an optionally empty transform destination. 40 | 41 | In addition, it is now possible to source not just from different streams but also from the same stream more than once. 42 | 43 | If you define a single source with multiple subject filters and transforms (in which case the ordering of the messages is guaranteed to be preserved), there cannot be any overlap between the filters. If you define multiple sources from the same stream, subject filters can overlap between sources, thereby making it possible to duplicate messages from the sourced stream, but the order of the messages between the sources is not guaranteed to be preserved. 44 | 45 | For example, if a stream contains messages on subjects "foo", "bar" and "baz", and you want to source only "foo" and "bar" from that stream, you could specify two subject transforms (with an empty destination) in a single source, or you can source twice from that stream, once with the "foo" subject filter and a second time with the "bar" subject filter.
46 | 47 | ## Stream config structure changes 48 | 49 | From the user's perspective these features manifest themselves as new fields in the Stream Configuration request and Stream Info response messages. 50 | 51 | In Mirror and Sources: 52 | - Additional `"subject_transforms"` array in the `"sources"` array and in `"mirror"` containing objects made of two string fields: `"src"` and `"dest"`. Note that if you use the `"subject_transforms"` array then you can _NOT_ also use the single string subject filters. The `"dest"` can be empty or `""`, in which case there is no transformation, just filtering. 53 | 54 | At the top level of the Stream Config: 55 | - Additional `"subject_transform"` field in Stream Config containing two strings: `"src"` and `"dest"`. 56 | 57 | ## KV bucket sourcing 58 | 59 | Subject transforms in streams open up the ability to do sourcing between KV buckets. The client library implements this by automatically adding the subject transform to the source configuration of the underlying stream for the bucket doing the sourcing. 60 | 61 | The transform in question should map the subject names from the sourced bucket's name to the sourcing bucket's name. 62 | 63 | e.g.
if bucket B sources A the transform config for the source from stream A in stream B should have the following transform in the SubjectTransforms array for that StreamSource: 64 | ``` 65 | { 66 | "src": "$KV.A.>", 67 | "dest": "$KV.B.>" 68 | } 69 | ``` 70 | 71 | ## Examples 72 | 73 | A stream that mirrors the `sourcedstream` stream using two subject filters and transform (in this example `foo` is transformed, but `bar` is not): 74 | 75 | ```JSON 76 | { 77 | "name": "sourcingstream", 78 | "retention": "limits", 79 | "max_consumers": -1, 80 | "max_msgs_per_subject": -1, 81 | "max_msgs": -1, 82 | "max_bytes": -1, 83 | "max_age": 0, 84 | "max_msg_size": -1, 85 | "storage": "file", 86 | "discard": "old", 87 | "num_replicas": 1, 88 | "duplicate_window": 120000000000, 89 | "mirror": 90 | { 91 | "name": "sourcedstream", 92 | "subject_transforms": [ 93 | { 94 | "src": "foo", 95 | "dest": "foo-transformed" 96 | }, 97 | { 98 | "src": "bar", 99 | "dest": "" 100 | } 101 | ] 102 | }, 103 | "sealed": false, 104 | "deny_delete": false, 105 | "deny_purge": false, 106 | "allow_rollup_hdrs": false, 107 | "allow_direct": false, 108 | "mirror_direct": false 109 | } 110 | ``` 111 | 112 | A stream that sources from the `sourcedstream` stream twice, each time using a single subject filter and transform: 113 | 114 | ```JSON 115 | { 116 | "name": "sourcingstream", 117 | "retention": "limits", 118 | "max_consumers": -1, 119 | "max_msgs_per_subject": -1, 120 | "max_msgs": -1, 121 | "max_bytes": -1, 122 | "max_age": 0, 123 | "max_msg_size": -1, 124 | "storage": "file", 125 | "discard": "old", 126 | "num_replicas": 1, 127 | "duplicate_window": 120000000000, 128 | "sources": [ 129 | { 130 | "name": "sourcedstream", 131 | "subject_transforms": [ 132 | { 133 | "src": "foo", 134 | "dest": "foo-transformed" 135 | } 136 | ] 137 | }, 138 | { 139 | "name": "sourcedstream", 140 | "subject_transforms": [ 141 | { 142 | "src": "bar", 143 | "dest": "bar-transformed" 144 | } 145 | ] 146 | } 147 | ], 148 | 
"sealed": false, 149 | "deny_delete": false, 150 | "deny_purge": false, 151 | "allow_rollup_hdrs": false, 152 | "allow_direct": false, 153 | "mirror_direct": false 154 | } 155 | ``` 156 | 157 | A Stream that sources from 2 streams and has a subject transform: 158 | 159 | ```JSON 160 | { 161 | "name": "foo", 162 | "retention": "limits", 163 | "max_consumers": -1, 164 | "max_msgs_per_subject": -1, 165 | "max_msgs": -1, 166 | "max_bytes": -1, 167 | "max_age": 0, 168 | "max_msg_size": -1, 169 | "storage": "file", 170 | "discard": "old", 171 | "num_replicas": 1, 172 | "duplicate_window": 120000000000, 173 | "sources": [ 174 | { 175 | "name": "source1", 176 | "filter_subject": "stream1.foo.>" 177 | }, 178 | { 179 | "name": "source1", 180 | "filter_subject": "stream1.bar.>" 181 | }, 182 | { 183 | "name": "source2", 184 | "subject_transforms": [ 185 | { 186 | "src": "stream2.foo.>", 187 | "dest": "foo2.>" 188 | }, 189 | { 190 | "src": "stream2.bar.>", 191 | "dest": "bar2.>" 192 | } 193 | ] 194 | } 195 | ], 196 | "subject_transform": { 197 | "src": "foo.>", 198 | "dest": "mapped.foo.>" 199 | }, 200 | "sealed": false, 201 | "deny_delete": false, 202 | "deny_purge": false, 203 | "allow_rollup_hdrs": false, 204 | "allow_direct": false, 205 | "mirror_direct": false 206 | } 207 | ``` 208 | ## Client implementation PRs 209 | 210 | - [jsm.go](https://github.com/nats-io/jsm.go/pull/436) [and](https://github.com/nats-io/jsm.go/pull/461) 211 | - [nats.go](https://github.com/nats-io/nats.go/pull/1200) [and](https://github.com/nats-io/nats.go/pull/1359) 212 | - [natscli](https://github.com/nats-io/natscli/pull/695) [and](https://github.com/nats-io/natscli/pull/845) -------------------------------------------------------------------------------- /adr/ADR-44.md: -------------------------------------------------------------------------------- 1 | # Versioning for JetStream Assets 2 | 3 | | Metadata | Value | 4 | |----------|-------------------------------| 5 | | Date | 2024-07-22 | 6 | | 
Author | @ripienaar |
7 | | Status | Implemented |
8 | | Tags | jetstream, server, 2.11, 2.12 |
9 | 
10 | | Revision | Date | Author | Info | Server Requirement |
11 | |----------|------------|-----------------|-----------------------------------------|--------------------|
12 | | 1 | 2024-07-22 | @ripienaar | Initial design | |
13 | | 2 | 2025-08-05 | @MauriceVanVeen | Add required feature level in API calls | 2.12.0 |
14 | | 3 | 2025-08-11 | @ripienaar | Mark as fully implemented | 2.12.0 |
15 | 
16 | # Context and Problem Statement
17 | 
18 | As development of the JetStream feature progresses there is a complex relationship between connected-server, JetStream
19 | meta-leader and JetStream asset host versions that requires careful consideration to maintain compatibility.
20 | 
21 | * The server a client connects to might be a Leafnode on a significantly older release and might not even have
22 | JetStream enabled
23 | * The meta-leader could be on a version that does not support a feature and so would not forward the API fields it
24 | is unaware of to the assigned servers
25 | * The server hosting an asset might not match the meta-leader and so can't honor the request for a certain
26 | configuration and would silently drop fields
27 | 
28 | In general our stance is to insist on homogeneous cluster versions, but it's not reasonable to expect this
29 | for leafnode servers nor is it possible during upgrades and other maintenance windows.
30 | 
31 | Our current approach is to check the connected-server version and assume that it's representative of the cluster as
32 | a whole, but this is a known incorrect approach, especially for Leafnodes as mentioned above.
33 | 
34 | We have considered approaches like fully versioning the API but this is unlikely to work for our case or be accepted
35 | by the team. Versioning the API would anyway not alleviate many of the problems encountered when upgrading and
36 | downgrading servers hosting long running assets.
As a result we are evaluating some more unconventional approaches
37 | that should still improve our overall stance.
38 | 
39 | This ADR specifies a way for the servers to expose some versioning information to help clients improve the
40 | compatibility story.
41 | 
42 | # Solution Overview
43 | 
44 | In lieu of enabling fine grained API versioning we want to start thinking about asset versioning instead: the server
45 | should report its properties when creating an asset and it should report similar properties when hosting an asset.
46 | 
47 | Fine grained API versioning would not really fix the entire problem as our long-lived assets would have to be upgraded
48 | over time between API versions as they get updated with new features.
49 | 
50 | So we attempt to address both classes of problem here by utilizing Metadata and a few new server features and concepts.
51 | 
52 | # API Support Level
53 | 
54 | The first concept we wish to introduce is a number that indicates the API level of the server's
55 | JetStream support.
56 | 
57 | | Level | Versions |
58 | |-------|----------|
59 | | 0 | < 2.11.0 |
60 | | 1 | 2.11.x |
61 | | 2 | 2.12.x |
62 | 
63 | While it's shown here incrementing at major version boundaries, that is not strictly required; if we were to introduce a
64 | critical new feature mid 2.11 that could cause a bump in support level mid release without it being an issue - we
65 | do not require strict SemVer adherence.
66 | 
67 | The server will calculate this for a Stream and Consumer configuration. Here is an example for 2.11.x. It's not
68 | anticipated we would have to keep supporting versions for every asset forever, it should be sufficient to support
69 | the most recent ones corresponding to the actively supported server versions.
70 | 
71 | ```golang
72 | func (s *Server) setConsumerAssetVersionMetadata(cfg *ConsumerConfig, create bool) {
73 | 	if cfg.Metadata == nil {
74 | 		cfg.Metadata = make(map[string]string)
75 | 	}
76 | 
77 | 	if create {
78 | 		cfg.Metadata[JSCreatedVersionMetadataKey] = VERSION
79 | 		cfg.Metadata[JSCreatedLevelMetadataKey] = JSFeatureLevel
80 | 	}
81 | 
82 | 	featureLevel := "0"
83 | 
84 | 	// Added in 2.11; absent or zero means the feature is not used.
85 | 	// One could be stricter and say that even if it is set but the time
86 | 	// has already passed, it is also not needed to restore the consumer.
87 | 	if cfg.PauseUntil != nil && !cfg.PauseUntil.IsZero() {
88 | 		featureLevel = "1"
89 | 	}
90 | 
91 | 	cfg.Metadata[JSRequiredFeatureMetadataKey] = featureLevel
92 | }
93 | ```
94 | 
95 | In this way we know per asset what server feature set it requires, and only the server needs this logic, versus all the
96 | clients needing it if each client had to assert `needs a server of at least level 1`.
97 | 
98 | We have to handle updates to asset configuration since an update might use a feature only found in newer servers.
99 | 
100 | Servers would advertise their supported API level in `jsz`, `varz` and `$JS.API.INFO`. It should also be logged
101 | at JetStream startup.
102 | 
103 | ## When to increment API Level
104 | 
105 | Generally when adding any new feature/field to the `StreamConfig` or `ConsumerConfig`. Especially when the field was set
106 | by a user and the asset should be loaded in offline mode when the feature is not supported by the server.
107 | 
108 | If a new feature is added, the API level only needs to be incremented if another feature planned for the same release
109 | didn't already increment the API level. Meaning if multiple new features are added within the same release cycle, the
110 | API level only needs to be incremented once and not for every new feature.
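The same per-feature pattern could extend to streams. The sketch below is hypothetical: whether `AllowMsgTTL` is a field that bumps the level, and the exact metadata key spelling, are illustrative assumptions rather than the server's actual implementation. It only shows the shape of the rule: start at level 0 and raise it when a newer feature is actually in use.

```go
package main

import "fmt"

// StreamConfig is an illustrative subset of a stream configuration;
// AllowMsgTTL stands in for a feature introduced at API level 1.
type StreamConfig struct {
	Metadata    map[string]string
	AllowMsgTTL bool
}

// setStreamRequiredLevel mirrors the consumer example above: the level
// is only raised when the newer feature is actually used.
func setStreamRequiredLevel(cfg *StreamConfig) {
	if cfg.Metadata == nil {
		cfg.Metadata = make(map[string]string)
	}
	level := "0"
	if cfg.AllowMsgTTL {
		level = "1"
	}
	cfg.Metadata["_nats.req.level"] = level
}

func main() {
	cfg := &StreamConfig{AllowMsgTTL: true}
	setStreamRequiredLevel(cfg)
	fmt.Println(cfg.Metadata["_nats.req.level"])
}
```

A stream that never enables the feature keeps required level `0`, so a downgrade of its hosting server remains safe.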
111 | 
112 | 
113 | # Server-set metadata
114 | 
115 | We'll store current server and asset related information in the existing `metadata` field, allowing us to expand this
116 | over time; today we propose the following:
117 | 
118 | | Name | Description |
119 | |-------------------|-----------------------------------------------|
120 | | `_nats.ver` | The current server version hosting an asset |
121 | | `_nats.level` | The current server API level hosting an asset |
122 | | `_nats.req.level` | The required API level to start an asset |
123 | 
124 | We intend to store some client hints in here to help us track what client language and version created assets.
125 | 
126 | As some of these are dynamic fields, tools like Terraform and NACK will need to know to ignore these fields
127 | when doing their remediation loops.
128 | 
129 | # Offline Assets
130 | 
131 | Today when an asset cannot be loaded it's simply not loaded. But to improve compatibility, user reporting and
132 | discovery we want to support a mode where a stream is visible in Stream reports but marked as offline with a reason.
133 | 
134 | To support this we add a field to the `io.nats.jetstream.api.v1.stream_list_response` and `io.nats.jetstream.api.v1.consumer_list_response`
135 | that holds a map of offline assets and reasons (`map[string]string`).
136 | 
137 | All offline streams will be added to the existing `missing` list in responses, but offline ones will have their reasons in
138 | the additional key. This ensures older tools will still understand these streams as inaccessible.
139 | 
140 | The `io.nats.jetstream.api.v1.stream_names_response` and `io.nats.jetstream.api.v1.consumer_names_response` results
141 | should include the offline assets.
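A client-side sketch of consuming such a response might look as follows. The `offline` field name and the `describeMissing` helper are assumptions for illustration; the ADR only specifies a `map[string]string` of reasons alongside the existing `missing` list.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// streamListResponse is an illustrative subset of
// io.nats.jetstream.api.v1.stream_list_response; the "offline" field
// name is an assumption for this sketch.
type streamListResponse struct {
	Missing []string          `json:"missing,omitempty"`
	Offline map[string]string `json:"offline,omitempty"`
}

// describeMissing keeps older-tool semantics (everything in "missing"
// is inaccessible) while surfacing a reason when the server gave one.
func describeMissing(payload []byte) ([]string, error) {
	var resp streamListResponse
	if err := json.Unmarshal(payload, &resp); err != nil {
		return nil, err
	}
	var out []string
	for _, name := range resp.Missing {
		if reason, ok := resp.Offline[name]; ok {
			out = append(out, fmt.Sprintf("%s: offline (%s)", name, reason))
		} else {
			out = append(out, fmt.Sprintf("%s: missing", name))
		}
	}
	return out, nil
}

func main() {
	payload := []byte(`{"missing":["ORDERS"],"offline":{"ORDERS":"required API level 2, server has 1"}}`)
	lines, err := describeMissing(payload)
	if err != nil {
		panic(err)
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

An older tool that only decodes `missing` still sees the stream as inaccessible; a newer one can additionally display the reason.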
142 | 
143 | For starting incompatible streams in offline mode we would need to load the config in the current manner to figure out
144 | which subjects Streams would listen on, since even while Streams are offline we do need the protections against
145 | overlapping subjects to be active to avoid issues later when the Stream can come online again.
146 | 
147 | # Safe unmarshalling of JSON data
148 | 
149 | The JetStream API and Meta-layer should start using the `DisallowUnknownFields` feature in the Go json package to
150 | detect when asked to load incompatible assets or serve incompatible API calls, and should error in the case of the
151 | API and start assets in Offline mode in the case of assets.
152 | 
153 | This will prevent assets inadvertently reverting some settings and changing behaviour during downgrades.
154 | 
155 | A POC branch against 2.11 main identified only 1 test failure after changing all JSON Unmarshal calls, and this was a
156 | legitimate bug in a test.
157 | 
158 | One possible approach that can be introduced in 2.11 is to already perform strict unmarshalling in all cases, but when an
159 | issue is detected we would log the error and then do a normal unmarshal to remain compatible. A configuration option
160 | should exist to turn this into a fatal error rather than a logged warning only.
161 | 
162 | It could also be desirable to allow a header in the API requests to signal that strict unmarshalling should be fatal for a
163 | specific API call in order to facilitate testing.
164 | 
165 | # Minimal supported API level for assets
166 | 
167 | When the server loads assets it should detect incompatible features using a combination of `DisallowUnknownFields`
168 | and comparing the server API levels to those required by the asset.
169 | 
170 | Incompatible assets should be loaded in Offline mode and an advisory should be published.
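The strict-then-permissive approach described above can be sketched with the standard library's `json.Decoder`. This is a minimal illustration, not the server's implementation; `unmarshalStrictWithFallback` and the `pause_until` field standing in for an unknown newer field are assumptions for the example.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
)

// ConsumerConfig is a deliberately small subset for illustration.
type ConsumerConfig struct {
	DurableName string `json:"durable_name"`
	AckPolicy   string `json:"ack_policy"`
}

// unmarshalStrictWithFallback first decodes with DisallowUnknownFields;
// on failure it logs the error and falls back to a permissive decode,
// matching the compatible mode described above.
func unmarshalStrictWithFallback(data []byte, v any) error {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	if err := dec.Decode(v); err != nil {
		log.Printf("strict unmarshal failed, falling back: %v", err)
		return json.Unmarshal(data, v)
	}
	return nil
}

func main() {
	// "pause_until" stands in for a field this server version does not know.
	data := []byte(`{"durable_name":"dur","ack_policy":"explicit","pause_until":"2030-01-01T00:00:00Z"}`)
	var cfg ConsumerConfig
	if err := unmarshalStrictWithFallback(data, &cfg); err != nil {
		panic(err)
	}
	fmt.Println(cfg.DurableName)
}
```

A configuration option or request header, as proposed, would simply return the strict error instead of falling back.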
171 | 
172 | # Required API level in API Calls
173 | 
174 | Clients can assert that a certain API call requires a minimum API Level by including a `Nats-Required-Api-Level` header,
175 | containing the requested minimum API level as a string. All endpoints under the JetStream API (`$JS.API.>`) should be
176 | supported.
177 | 
178 | This asserts whether the server that's responding supports this API level. The request will error with
179 | `api level not supported` if the server has a lower API level than required.
180 | 
181 | A single server does not represent a cluster-wide supported API level. In the future we should keep track of a
182 | cluster-agreed API level, and have the servers enforce an agreed-upon API level.
183 | 
--------------------------------------------------------------------------------
/adr/ADR-13.md:
--------------------------------------------------------------------------------
1 | # Pull Subscribe internals
2 | 
3 | | Metadata | Value |
4 | |----------|---------------------------|
5 | | Date | 2021-07-20 |
6 | | Author | wallyqs |
7 | | Status | Partially Implemented |
8 | | Tags | jetstream, client |
9 | 
10 | ## Motivation
11 | 
12 | One of the forms of message delivery in JetStream is pull based
13 | consumers. This ADR describes the current state of
14 | implementing message consumption for this type of consumer.
15 | 
16 | ## Context
17 | 
18 | A pull based consumer is a type of consumer that does not have a
19 | `delivery subject`, that is, the server _does not know_ where to send
20 | the messages. Instead, the clients have to request the messages
21 | to be delivered as needed from the server.
For example, given a stream `bar`
22 | with a consumer with the durable name `dur` (all pull subscribers have to be
23 | durable), a pull request would look like this in terms of the
24 | protocol:
25 | 
26 | ```shell
27 | PUB $JS.API.CONSUMER.MSG.NEXT.bar.dur _INBOX.x7tkDPDLCOEknrfB4RH1V7.UBZe2D 0
28 | ```
29 | 
30 | ### Request Body
31 | 
32 | There are 3 possible fields that can be present in the JSON body of the request.
33 | If no body is present (as in the example above), all defaults are assumed.
34 | 
35 | #### batch
36 | The number of messages that the server should send. Minimum is 1.
37 | There is no limit on the batch size at the library level.
38 | The account and the server may have some limits.
39 | 
40 | #### no_wait
41 | 
42 | A boolean value indicating that this pull request is of the no wait type. See below for details. Default is false.
43 | 
44 | #### expires
45 | 
46 | The number of nanoseconds from now after which this pull request will expire. <= 0 or not supplied means the expiration is not applied.
47 | No wait takes precedence over expires if both are supplied.
48 | 
49 | ### Pull(n) Requests
50 | 
51 | A request with an empty payload is identical to a pull with a batch size of 1 and
52 | results in the server sending the next (1) available message, for example:
53 | 
54 | ```shell
55 | SUB _INBOX.example 0
56 | +OK
57 | PUB $JS.API.CONSUMER.MSG.NEXT.bar.dur _INBOX.example 0
58 | +OK
59 | MSG bar 0 $JS.ACK.bar.dur.1.9808.9808.1626818873482533000.0 4
60 | helo
61 | ```
62 | 
63 | Note that even though the inbox used for the request was
64 | `_INBOX.example`, when the message got delivered the subject was
65 | rewritten into `bar`, which is the subject of the message that is
66 | persisted in JetStream.
67 | 
68 | A pull request for the next message will linger until there is no
69 | more interest in the subject, the client disconnects, or the batch size is filled.
70 | Each pull request will increase the `num_awaiting` counter of
71 | inflight pull requests for a consumer [2]. At most, a consumer can only have 512
72 | inflight pull requests, though this can be changed when creating the
73 | consumer with the `max_waiting` option [1]:
74 | 
75 | ```shell
76 | PUB $JS.API.CONSUMER.INFO.bar.dur _INBOX.uMfJLECClHCs0CLfWF7Rsj.ds2ZxC4o 0
77 | MSG _INBOX.uMfJLECClHCs0CLfWF7Rsj.ds2ZxC4o 1 601
78 | {
79 |   "type": "io.nats.jetstream.api.v1.consumer_info_response",
80 |   "stream_name": "bar",
81 |   "name": "dur",
82 |   "created": "2021-07-20T21:35:11.825973Z",
83 |   "config": {
84 |     "durable_name": "dur",
85 |     "deliver_policy": "all",
86 |     "ack_policy": "explicit",
87 |     "ack_wait": 30000000000,
88 |     "max_deliver": -1,
89 |     "filter_subject": "bar",
90 |     "replay_policy": "instant",
91 |     "max_waiting": 512, <-- Maximum Inflight Pull/Fetch Requests [1]
92 |     "max_ack_pending": 20000
93 |   },
94 |   "delivered": {
95 |     "consumer_seq": 11561,
96 |     "stream_seq": 11560
97 |   },
98 |   "ack_floor": {
99 |     "consumer_seq": 11561,
100 |     "stream_seq": 11560
101 |   },
102 |   "num_ack_pending": 0,
103 |   "num_redelivered": 0,
104 |   "num_waiting": 1, <-- Inflight Pull/Fetch Requests [2]
105 |   "num_pending": 0,
106 |   "cluster": {
107 |     "leader": "NCUOUH6YICRESE73CKTXESMGA4ZN4KTELXPUIE6JCRTL6IF4UWE3B2Z4"
108 |   }
109 | }
110 | ```
111 | 
112 | When making a pull request it is also possible to request more than one message:
113 | 
114 | ```shell
115 | PUB $JS.API.CONSUMER.MSG.NEXT.bar.dur _INBOX.x7tkDPDLCOEknrfB4RH1V7.OgY4M7 32
116 | {"batch":5,"expires":4990000000}
117 | ```
118 | 
119 | Whenever a pull request times out, the count of `num_waiting` will remain increased for a consumer,
120 | but this will eventually reset once it reaches the max waiting inflight that was configured
121 | for the pull consumer.
122 | 123 | ### No Wait Pull Requests 124 | 125 | In order to get a response from the server right away, a client can 126 | make a pull request with the `no_wait` option enabled. For example: 127 | 128 | ```shell 129 | PUB $JS.API.CONSUMER.MSG.NEXT.bar.dur _INBOX.x7tkDPDLCOEknrfB4RH1V7.OgY4M7 26 130 | {"batch":1,"no_wait":true} 131 | ``` 132 | 133 | The result of a no wait pull request is a guaranteed instant response 134 | from the server that will be either the next message or an error, 135 | where an error could be a server error such as `503` in case JS 136 | service is not available. Most commonly, the result of a no wait pull 137 | request will be a `404` no messages error: 138 | 139 | ```shell 140 | HMSG _INBOX.x7tkDPDLCOEknrfB4RH1V7.OgY4M7 2 28 28 141 | NATS/1.0 404 No Messages 142 | ``` 143 | 144 | ## Design 145 | 146 | The implementation for pull subscribe uses a combination of both no wait and 147 | lingering pull requests described previously. 148 | 149 | In the Go client, a simple example of the API looks as follows: 150 | 151 | ```go 152 | sub, err := js.PullSubscribe("stream-name", "durable") 153 | if err != nil { 154 | log.Fatal(err) 155 | } 156 | 157 | for { 158 | msgs, err := sub.Fetch(1) 159 | if err != nil { 160 | log.Println("Error:", err) 161 | continue 162 | } 163 | for _, msg := range msgs { 164 | msg.Ack() 165 | } 166 | } 167 | ``` 168 | 169 | When implementing `PullSubscribe` there are two main cases to 170 | consider: `Pull(n)` and `Pull(1)`. 171 | 172 | #### Pull(n) 173 | 174 | `Pull(n)` batch requests are implemented somewhat similarly to old style 175 | requests. When making a pull request, first a no wait request is 176 | done to try to get the messages that may already be available as 177 | needed. 
If there are no messages (a 404 status message error from the
178 | JetStream server), then a long pull request is made:
179 | 
180 | ```shell
181 | SUB _INBOX.NQ2uAIXd4GoozKOTfECtIg 1
182 | PUB $JS.API.CONSUMER.MSG.NEXT.bar.dur _INBOX.WvaJLnIXcj8Zf5SrxlHMTS 26
183 | {"batch":5,"no_wait":true}
184 | 
185 | # Result of no wait request if there are no messages
186 | HMSG _INBOX.WvaJLnIXcj8Zf5SrxlHMTS 8 28 28
187 | NATS/1.0 404 No Messages
188 | 
189 | # Next pull request is a long pull request with client side timeout
190 | PUB $JS.API.CONSUMER.MSG.NEXT.bar.dur _INBOX.NQ2uAIXd4GoozKOTfECtIg 32
191 | {"expires":4990000000,"batch":5}
192 | ```
193 | 
194 | As part of the request payload, the `batch` field is set to the number
195 | of expected messages and `expires` is set to cancel the request
196 | `100ms` before the client side timeout. In the example above, the
197 | timeout is `5s` so in the payload `expires` is the result of `5s -
198 | 100ms` represented in nanoseconds.
199 | 
200 | After making the first no wait request, it is also recommended
201 | to send an auto unsubscribe protocol message, discounting the message already
202 | received as a result of the no wait request. In case the batch
203 | request was for 5 messages, the client would auto unsubscribe after
204 | receiving 6. Since the first message was an error, the `UNSUB` is (batch+1)
205 | to account for that initial error status message.
206 | 
207 | ```shell
208 | UNSUB 1 6
209 | ```
210 | 
211 | When successful, the result of a batch request will be at least one
212 | message being delivered to the client. In case not all messages are
213 | delivered and the client times out or goes away during the pull
214 | request, it is recommended to unsubscribe to remove the interest in
215 | the awaited messages from the server, otherwise this risks the server
216 | sending messages to an inbox of a connected client that is no longer
217 | expecting messages.
218 | 
219 | ```shell
220 | SUB _INBOX.WvaJLnIXcj8Zf5SrxlHMTS 1
221 | PUB $JS.API.CONSUMER.MSG.NEXT.bar.dur _INBOX.WvaJLnIXcj8Zf5SrxlHMTS 26
222 | {"batch":5,"no_wait":true}
223 | HMSG _INBOX.WvaJLnIXcj8Zf5SrxlHMTS 8 28 28
224 | NATS/1.0 404 No Messages
225 | UNSUB 1 6
226 | PUB $JS.API.CONSUMER.MSG.NEXT.bar.dur _INBOX.WvaJLnIXcj8Zf5SrxlHMTS 32
227 | {"expires":4990000000,"batch":5}
228 | MSG hello 1 $JS.ACK.bar.dur.2.29034.29041.1626845015078897000.0 4
229 | helo
230 | # Only 1 out of 5 messages received, so the client goes away and unsubscribes.
231 | UNSUB 1
232 | ```
233 | 
234 | ##### Errors and Status messages handling
235 | 
236 | While receiving messages, at any point the client may instead
237 | receive an error from the server as a status message.
238 | 
239 | ```sh
240 | MSG hello 1 $JS.ACK.bar.dur.2.29034.29041.1626845015078897000.0 5
241 | hello
242 | # (System reached too many inflight pull requests condition)
243 | HMSG _INBOX.WvaJLnIXcj8Zf5SrxlHMRI 1 32 32
244 | NATS/1.0 408 Request Timeout
245 | ```
246 | 
247 | Whenever there is a status message, it is recommended that the error
248 | is not returned to the user as a processable message and is instead
249 | handled as an error condition:
250 | 
251 | ```go
252 | for {
253 | 	msgs, err := sub.Fetch(5)
254 | 	if err != nil {
255 | 		// Any error such as:
256 | 		//
257 | 		// - client timeout
258 | 		// - request timeout (408)
259 | 		// - bad request
260 | 		// - no responders
261 | 		// - system unavailable (no current JS quorum)
262 | 		log.Println("Error:", err)
263 | 		continue
264 | 	}
265 | 	// Msgs here are never status or error messages;
266 | 	// errors are handled in the branch above.
267 | 	for _, msg := range msgs {
268 | 		msg.Ack()
269 | 	}
270 | }
271 | ```
272 | 
273 | #### Pull Optimization
274 | 
275 | For the case of pulling a single message, it is possible to optimize
276 | things by making the first pull request as a no wait request instead
277 | and by preparing a new style request/response handler using a wildcard
278 | subscription. This will result in a less chatty protocol and also works better with
279 | topologies where JetStream is running as a leafnode on the edge.
280 | 
281 | ```shell
282 | SUB _INBOX.miOJjN58koGhobmCGCWKJz.* 2
283 | PUB $JS.API.CONSUMER.MSG.NEXT.bar.dur _INBOX.miOJjN58koGhobmCGCWKJz.asdf 26
284 | {"batch":1,"no_wait":true}
285 | ```
286 | 
287 | Similar to `Pull(n)`, when the first no wait request fails,
288 | a longer old style request is then made with a
289 | unique inbox.
290 | 
291 | **Note**: Each pull subscriber must have its own pull request/response handler.
292 | The default implementation of new style requests cannot be used for this
293 | purpose due to how the subject gets rewritten, which would cause responses to be dropped.
--------------------------------------------------------------------------------
/adr/ADR-51.md:
--------------------------------------------------------------------------------
1 | # JetStream Message Scheduler
2 | 
3 | | Metadata | Value |
4 | |----------|-----------------|
5 | | Date | 2025-03-21 |
6 | | Author | @ripienaar |
7 | | Status | Approved |
8 | | Tags | jetstream, 2.12 |
9 | 
10 | 
11 | | Revision | Date | Author | Info |
12 | |----------|------------|------------|-----------------------------------------|
13 | | 1 | 2025-03-21 | @ripienaar | Document Initial Design |
14 | | 2 | 2025-09-30 | @ripienaar | Use `omitempty` on configuration fields |
15 | 
16 | ## Context and Motivation
17 | 
18 | It's a frequently requested feature to allow messages to be delivered on a schedule or to support delayed publishing.
19 | 
20 | We propose here a feature where one message contains a Cron-like schedule and new messages are produced, into the same stream, on the schedule. In all cases the last message on a subject holds the current schedule. In other words, every schedule must have its own unique subject.
21 | 
22 | We target a few use cases in the initial design:
23 | 
24 | * Publish a message at a later time
25 | * Regularly publish a message on a schedule
26 | * Publish the latest message for a subject on a schedule, to be used for data sampling
27 | 
28 | ## Single scheduled message
29 | 
30 | In this use case the Stream will essentially hold onto a message and publish it again at a later time. Once published, the held message is removed.
31 | 
32 | ```bash
33 | $ nats pub -J 'schedules.orders.single' \
34 |     -H "Nats-Schedule: @at 2009-11-10T23:00:00Z" \
35 |     -H "Nats-Schedule-TTL: 5m" \
36 |     -H "Nats-Schedule-Target: orders" \
37 |     body
38 | ```
39 | 
40 | This message will be published near the supplied timestamp. The `Nats-Schedule-Target` must be a subject in the same stream, and the published message could be republished using the Stream Republish configuration. Additional headers added to the message will be sent to the target subject verbatim.
41 | 
42 | If a message is made with a schedule in the past it is immediately sent. If a server was down for a month and a scheduled message is recovered, even if it was scheduled for a month ago, it will be sent immediately. To avoid this, add a `Nats-TTL` header to the message so it will be removed after the TTL.
43 | 
44 | Messages produced from this kind of schedule will have a `Nats-Schedule-Next` header set with the value `purge`.
45 | 
46 | The generated message has a Message TTL of `5m`.
47 | 
48 | The time format is RFC3339 and may include a timezone, which the server will convert to UTC when received and execute according to UTC time later.
49 | 
50 | There may only be one message per subject that holds a schedule. If a user wishes to have many delayed messages all publishing into the same subject, the scheduled messages need to go into subjects like `orders.schedule.UUID`, where UUID is a unique identifier, with the `Nats-Schedule-Target` set to the desired target subject.
51 | 
52 | ## Cron-like schedules
53 | 
54 | In this use case the Stream holds a message with a Cron-like schedule attached to it and the Stream will produce messages on the given schedule.
55 | 
56 | ```bash
57 | $ nats pub -J 'schedules.orders.hourly' \
58 |     -H "Nats-Schedule: @hourly" \
59 |     -H "Nats-Schedule-TTL: 5m" \
60 |     -H "Nats-Schedule-Target: orders" \
61 |     body
62 | ```
63 | 
64 | In this case a new message will be placed in `orders` holding the supplied body unchanged. The original schedule message will remain and will again produce a message the next hour. Additional headers added to the message will be sent to the target subject verbatim. If the original schedule message has a `Nats-TTL` header the schedule will be removed after that time.
65 | 
66 | The generated message has a Message TTL of `5m`.
67 | 
68 | Execution times will be in UTC regardless of the server's local time zone.
69 | 
70 | There may only be one message per subject that holds a schedule. If a user wishes to have many scheduled messages all publishing into the same subject, the scheduled messages need to go into subjects like `orders.cron.UUID`, where UUID is a unique identifier, with the `Nats-Schedule-Target` set to the desired target subject.
71 | 
72 | ### Schedule Format
73 | 
74 | #### 6 field crontab format
75 | 
76 | A valid schedule header can match normal cron behavior, with a few additional conveniences.
77 | 
78 | | Field Name | Allowed Values |
79 | |--------------|-------------------------------|
80 | | Seconds | 0-59 |
81 | | Minutes | 0-59 |
82 | | Hours | 0-23 |
83 | | Day of Month | 1-31 |
84 | | Month | 1-12, or names |
85 | | Day of Week | 0-6, or names, 0 means Sunday |
86 | 
87 | (Note this is largely copied from the `crontab(5)` man page)
88 | 
89 | A field may contain an asterisk (*), which always stands for "first-last". See `Step Values` for interaction with the `/` special character.
90 | 
91 | Ranges of numbers are allowed. For example, 8-11 for an 'hours' entry specifies execution at hours 8, 9, 10, and 11. The first number must be less than or equal to the second one.
92 | 
93 | Lists are allowed. A list is a set of numbers (or ranges) separated by commas. Examples: `1,2,5,9`, `0-4,8-12`.
94 | 
95 | Step values can be used in conjunction with ranges. Following a range with "/" specifies skips of the number's value through the range. For example, `0-23/2` can be used in the 'hours' field to specify command execution for every other hour. Step values are also permitted after an asterisk, so if specifying a job to be run every two hours, you can use `*/2`.
96 | 
97 | Names can also be used for the 'month' and 'day of week' fields. Use the first three letters of the particular day or month (case does not matter). Ranges and lists of names are allowed. Examples: `mon,wed,fri`, `jan-mar`.
98 | 
99 | Note: The day of a command's execution can be specified in the following two fields: 'day of month' and 'day of week'. If both fields are restricted (i.e., do not contain the `*` character), the command will be run when either field matches the current time. For example, `0 30 4 1,15 * 5` would cause a command to be run at 4:30 am on the 1st and 15th of each month, plus every Friday.
100 | 
101 | #### Predefined Schedules
102 | 
103 | A number of predefined schedules exist; they can be used like `Nats-Schedule: @hourly`.
104 | 
105 | | Entry | Description | Cron Format |
106 | |----------------------------|--------------------------------------------|---------------|
107 | | `@yearly` (or `@annually`) | Run once a year, midnight, Jan. 1st | `0 0 0 1 1 *` |
108 | | `@monthly` | Run once a month, midnight, first of month | `0 0 0 1 * *` |
109 | | `@weekly` | Run once a week, midnight between Sat/Sun | `0 0 0 * * 0` |
110 | | `@daily` (or `@midnight`) | Run once a day, midnight | `0 0 0 * * *` |
111 | | `@hourly` | Run once an hour, beginning of hour | `0 0 * * * *` |
112 | 
113 | #### Intervals
114 | 
115 | You may also schedule a job to execute at fixed intervals, starting at the time it's added or cron is run. This is supported by formatting the cron spec like this:
116 | 
117 | `@every 1m`
118 | 
119 | The time specification complies with the Go `time.ParseDuration()` format.
120 | 
121 | ## Subject Sampling
122 | 
123 | In this use case we could have a sensor that produces a high frequency of data into a Stream subject in a Leafnode. We might have realtime processing happening at the site where the data is produced, but externally we only want to sample the data every 5 minutes.
124 | 
125 | ```bash
126 | $ nats pub -J 'schedules.sensors.cnc_temperature_sampled' \
127 |     -H "Nats-Schedule: @every 5m" \
128 |     -H "Nats-Schedule-Source: sensors.cnc.temperature" \
129 |     -H "Nats-Schedule-Target: sensors.sampled.cnc.temperature" \
130 |     ""
131 | ```
132 | 
133 | Here the local site would produce high frequency temperature readings into `sensors.cnc.temperature`, but we publish the latest sensor value every 5 minutes into an aggregate subject.
134 | 
135 | ## Headers
136 | 
137 | These headers can be set on messages that define a schedule:
138 | 
139 | | Header | Description |
140 | |------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------|
141 | | `Nats-Schedule` | The schedule the message will be published on |
142 | | `Nats-Schedule-Target` | The subject the message will be delivered to |
143 | | `Nats-Schedule-Source` | Instructs the schedule to read the last message on the given subject and publish it. If the subject is empty, nothing is published; wildcards are not supported |
144 | | `Nats-Schedule-TTL` | When publishing, sets a TTL on the message if the stream supports per message TTLs |
145 | 
146 | Messages that the schedules produce will have these headers set, in addition to any other headers found on the schedule message.
147 | 
148 | | Header | Description |
149 | |----------------------|------------------------------------------------------------------------------------------|
150 | | `Nats-Scheduler` | The subject holding the schedule |
151 | | `Nats-Schedule-Next` | Timestamp for next invocation for cron schedule messages or `purge` for delayed messages |
152 | | `Nats-TTL` | The `Nats-Schedule-TTL` value (`5m` in the examples) when given |
153 | 
154 | The body of the message will simply be the provided body in the schedule.
155 | 
156 | A valid schedule header can match normal cron behavior as defined earlier.
157 | 
158 | All time calculations will be done in UTC; a Cron schedule like `0 0 5 * * *` means exactly 5AM UTC.
159 | 
160 | ## Stream Configuration
161 | 
162 | #### Creating the stream
163 | 
164 | The `AllowMsgSchedules` field is new, added specifically for this feature, and must be set to true for the feature to be enabled.
165 | 
166 | ```go
167 | type StreamConfig struct {
168 | 	// AllowMsgSchedules enables the feature
169 | 	AllowMsgSchedules bool `json:"allow_msg_schedules,omitempty"`
170 | }
171 | ```
172 | * If the user intends to use the `Nats-Schedule-TTL` feature, `AllowMsgTTL` must be true for the stream.
173 | * Setting this on a Source or Mirror should be denied
174 | * This feature can be enabled on existing streams but not disabled
175 | * A Stream with this feature on should require API level 2
176 | 
177 | #### Stream Subjects
178 | As already noted, every schedule must have its own unique subject, so it is recommended that the stream subjects contain wildcards to easily allow for many schedules.
179 | For instance, adding `schedules.>` as a stream subject would allow for all the example subjects: `schedules.orders.single`, `schedules.orders.hourly` and `schedules.sensors.cnc_temperature_sampled`.
180 | 
181 | The target subjects are just normal subjects like `orders`, `sensors.cnc.temperature` or `sensors.sampled.cnc.temperature`, and their pattern must also be added as a stream subject.
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Apache License
2 | Version 2.0, January 2004
3 | http://www.apache.org/licenses/
4 | 
5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6 | 
7 | 1. Definitions.
8 | 
9 | "License" shall mean the terms and conditions for use, reproduction,
10 | and distribution as defined by Sections 1 through 9 of this document.
11 | 
12 | "Licensor" shall mean the copyright owner or entity authorized by
13 | the copyright owner that is granting the License.
14 | 
15 | "Legal Entity" shall mean the union of the acting entity and all
16 | other entities that control, are controlled by, or are under common
17 | control with that entity.
For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 
47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. 
Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 
122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. 
In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. 
We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright [yyyy] [name of copyright owner] 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /adr/ADR-1.md: -------------------------------------------------------------------------------- 1 | # JetStream JSON API Design 2 | 3 | |Metadata|Value| 4 | |--------|-----| 5 | |Date |2020-04-30| 6 | |Author |@ripienaar| 7 | |Status |Implemented| 8 | |Tags |jetstream, client, server| 9 | 10 | ## Context 11 | 12 | Several JetStream APIs exist focussed on message submission, administration and message retrieval. Where it makes sense 13 | these APIs are based on JSON documents and have Schemas and Schema type identifiers. In some cases this is not possible 14 | or would present a large overhead, so a more traditional text approach is taken for those APIs. 15 | 16 | This document outlines the basic approach - rather than detail of every single API - it should give a good introduction 17 | to those wishing to write new clients or understand the underlying behaviors. 
18 | 19 | ## Overview 20 | 21 | The API is built using the NATS Request-Reply pattern: a request to `$JS.API.STREAM.INFO.ORDERS` requests the information for 22 | the `ORDERS` stream, and the response is a JSON document that has type `io.nats.jetstream.api.v1.stream_info_response`. 23 | 24 | In this case the API has no request input; the server will accept a nil payload or `{}` to indicate an empty request body. 25 | In cases where there is an optional input, the optional JSON document can be added when needed. 26 | 27 | In this example we accessed the subject `$JS.API.STREAM.INFO.ORDERS`; every API has a unique subject, and generally 28 | the subjects include tokens indicating the item being accessed. This assists in generating ACLs giving people access 29 | to either subsets of the API or even down to a single Stream or Consumer. 30 | 31 | Errors are either in the form of JSON documents or a `-ERR` style string; more on this in the dedicated section. 32 | 33 | One can observe the API in action using the `nats` CLI by adding the `--trace` option; any API interaction to and from 34 | JetStream is then logged showing Subjects and Bodies unmodified. This is an invaluable way to observe the interaction model. 35 | 36 | ## Accessing 37 | 38 | Accessing the API is via the Request-Reply system. 39 | 40 | ```nohighlight 41 | $ nats req '$JS.API.STREAM.NAMES' '{}' 42 | 14:18:14 Sending request on "$JS.API.STREAM.NAMES" 43 | 14:18:14 Received on "_INBOX.vrP0URcbRWXaMrqtFIDAm6.DiACQMKp" rtt 1.036883ms 44 | {"type":"io.nats.jetstream.api.v1.stream_names_response","total":1,"offset":0,"limit":1024,"streams":["ORDERS"]} 45 | ``` 46 | 47 | Here the request in question had an empty payload; the server accepts nil, an empty string or `{}` as valid payloads in that case.
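As an illustration of working with this reply, the `stream_names_response` shown above can be decoded with nothing more than the Go standard library. This is a minimal sketch; the struct and helper names are illustrative, not part of any official client:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// streamNamesResponse mirrors the fields of the
// io.nats.jetstream.api.v1.stream_names_response document shown above.
type streamNamesResponse struct {
	Type    string   `json:"type"`
	Total   int      `json:"total"`
	Offset  int      `json:"offset"`
	Limit   int      `json:"limit"`
	Streams []string `json:"streams"`
}

// decodeStreamNames parses a reply payload from $JS.API.STREAM.NAMES.
func decodeStreamNames(payload []byte) (streamNamesResponse, error) {
	var r streamNamesResponse
	err := json.Unmarshal(payload, &r)
	return r, err
}

func main() {
	reply := []byte(`{"type":"io.nats.jetstream.api.v1.stream_names_response","total":1,"offset":0,"limit":1024,"streams":["ORDERS"]}`)
	r, err := decodeStreamNames(reply)
	if err != nil {
		panic(err)
	}
	fmt.Println(r.Type, r.Streams)
}
```

The same pattern applies to any of the JSON APIs: send a request with `nc.Request`, then unmarshal the reply into a struct matching the documented schema.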
48 | 49 | Access to the API is via a unique subject per API; some of the subjects can be seen here (this is not an exhaustive list): 50 | 51 | ```nohighlight 52 | $JS.API.STREAM.CREATE.%s 53 | $JS.API.STREAM.UPDATE.%s 54 | $JS.API.STREAM.NAMES 55 | $JS.API.STREAM.LIST 56 | $JS.API.STREAM.INFO.%s 57 | $JS.API.STREAM.DELETE.%s 58 | $JS.API.STREAM.PURGE.%s 59 | $JS.API.STREAM.MSG.DELETE.%s 60 | $JS.API.STREAM.MSG.GET.%s 61 | $JS.API.STREAM.SNAPSHOT.%s 62 | $JS.API.STREAM.RESTORE.%s 63 | $JS.API.STREAM.PEER.REMOVE.%s 64 | $JS.API.STREAM.LEADER.STEPDOWN.%s 65 | ``` 66 | 67 | As you can see these are all STREAM related; the `%s` placeholder would be the name of the stream being accessed. 68 | 69 | ## Paging 70 | 71 | Some APIs, like the `$JS.API.STREAM.NAMES` one above, are paged; this is indicated by the presence of the `total`, `offset` and `limit` 72 | fields in the reply. 73 | 74 | These APIs take a request parameter `offset` to move through pages, in other words: 75 | 76 | ```nohighlight 77 | $ nats req '$JS.API.STREAM.NAMES' '{"offset": 1024}' 78 | ``` 79 | 80 | This will list the stream names starting at number 1024. 81 | 82 | ## Anatomy of a Response 83 | 84 | Generally we have a few forms of response, with some representative ones shown here: 85 | 86 | ### Good response 87 | 88 | ```json 89 | { 90 | "type": "io.nats.jetstream.api.v1.stream_names_response", 91 | "total": 1, 92 | "offset": 0, 93 | "limit": 1024, 94 | "streams": [ 95 | "KV_NATS" 96 | ] 97 | } 98 | ``` 99 | 100 | This is not an error response (no `error` field); it's paged (it has `total`, `offset` and `limit`) and is of the type `io.nats.jetstream.api.v1.stream_names_response`, which indicates its schema (see below).
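Walking all pages of such a response is simple arithmetic over `total`, `offset` and `limit`. A minimal sketch follows; the helper name is illustrative, not from any client library:

```go
package main

import "fmt"

// nextOffset computes the offset for the following page of a paged
// JetStream API reply, and reports whether another request is needed.
func nextOffset(total, offset, limit int) (int, bool) {
	next := offset + limit
	if next >= total {
		return 0, false // the last reply covered the remaining items
	}
	return next, true
}

func main() {
	// A hypothetical account with 2500 streams and the default limit of
	// 1024 needs three requests, at offsets 0, 1024 and 2048.
	offset, more := 0, true
	for more {
		fmt.Println("requesting page at offset", offset)
		offset, more = nextOffset(2500, offset, 1024)
	}
}
```

Each iteration would send `{"offset": N}` as the request body, as shown in the `nats req` example above.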
101 | 102 | ### Error Response 103 | 104 | ```json 105 | { 106 | "type": "io.nats.jetstream.api.v1.consumer_info_response", 107 | "error": { 108 | "code": 404, 109 | "err_code": 10059, 110 | "description": "stream not found" 111 | } 112 | } 113 | ``` 114 | 115 | This is a response to a consumer info request; its type is `io.nats.jetstream.api.v1.consumer_info_response`, which indicates its schema (see below). 116 | This is an error response, where the `description` field is variable and can change its content between server versions. The fields of a healthy response are not shown for an error. 117 | 118 | ## Schemas 119 | 120 | All requests and responses that are in JSON format have JSON Schema Draft 7 documents describing them. 121 | 122 | For example, the request to create a stream has the type `io.nats.jetstream.api.v1.stream_create_request` and the response 123 | is a `io.nats.jetstream.api.v1.stream_create_response`. We have additional Schemas that describe some subsets of information, 124 | for example `io.nats.jetstream.api.v1.stream_configuration` describes a valid Stream Configuration. 125 | 126 | Generally requests do not include the type - it's inferred from which subject is accessed - but replies all have type hints. 127 | 128 | The `nats` CLI can list, view and validate documents against these schemas: 129 | 130 | ```nohighlight 131 | $ nats schema list stream_create 132 | Matched Schemas: 133 | 134 | io.nats.jetstream.api.v1.stream_create_request 135 | io.nats.jetstream.api.v1.stream_create_response 136 | ``` 137 | 138 | The document can be viewed; pass `--yaml` to view it in YAML format, which is easier to read: 139 | 140 | ```nohighlight 141 | $ nats schema show io.nats.jetstream.api.v1.stream_create_request 142 | { 143 | "$schema": "http://json-schema.org/draft-07/schema#", 144 | "$id": "https://nats.io/schemas/jetstream/api/v1/stream_create_request.json", 145 | "description": "A request to the JetStream $JS.API.STREAM.CREATE API", 146 | ...
147 | ``` 148 | 149 | Finally, if you're developing an agent it could be useful to capture the JSON documents and validate them; the CLI can 150 | help with that: 151 | 152 | ```nohighlight 153 | $ nats schema validate io.nats.jetstream.api.v1.stream_create_request x.json 154 | Validation errors in x.json: 155 | 156 | (root): max_consumers is required 157 | (root): max_msgs is required 158 | ``` 159 | 160 | Here we validated the `x.json` file and found a number of errors. 161 | 162 | If you're writing a client and have JSON Schema validators to hand, you can access the schemas in our [schema repository](https://github.com/nats-io/jsm.go/tree/main/schemas). 163 | 164 | ## Data Types 165 | 166 | The JetStream API has many data types; generally the JSON Schema tries to point these out, though we certainly have some gaps in the data. 167 | This section explores a few of the types in detail. 168 | 169 | ### Sequence related unsigned 64bit integers 170 | 171 | JetStream can store messages up to the limit of an unsigned 64 bit integer, which means status data, start points and more all carry these unsigned 64 bit data types. 172 | 173 | The type is particularly problematic near the top end of the scale since it exceeds what is representable in JSON numbers; languages might need custom 174 | JSON parsers to handle this data. In practice, though, this will only become a problem after many, many years of creating data at the full theoretical limit 175 | of message ingest.
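In Go, decoding such a field into a `uint64` struct field is already safe, but dynamic decoding into `interface{}` goes through `float64` and silently loses precision above 2^53. A minimal sketch of the `json.Number` approach, using the `opt_start_seq` field from the schema as an example (the helper name is illustrative):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strconv"
)

// decodeSeq parses a sequence field without going through float64,
// by keeping the digits as text via json.Number until conversion.
func decodeSeq(payload []byte) (uint64, error) {
	dec := json.NewDecoder(bytes.NewReader(payload))
	dec.UseNumber()
	var m map[string]json.Number
	if err := dec.Decode(&m); err != nil {
		return 0, err
	}
	return strconv.ParseUint(m["opt_start_seq"].String(), 10, 64)
}

func main() {
	// The maximum unsigned 64 bit value from the schema survives intact.
	seq, err := decodeSeq([]byte(`{"opt_start_seq": 18446744073709551615}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(seq)
}
```

Languages without a native 64 bit integer type need a similar string-based intermediate step.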
176 | 177 | Today all these fields are flagged in the schema; here's an example of one: 178 | 179 | ```json 180 | "opt_start_seq": { 181 | "minimum": 0, 182 | "$comment": "unsigned 64 bit integer", 183 | "type": "integer", 184 | "maximum": 18446744073709551615 185 | } 186 | ``` 187 | 188 | ### Variable size integers 189 | 190 | Some of the fields are limited by the architecture of the server - for example, in practice on a 32 bit system the number of consumers is limited to a 32 bit unsigned int, while on a 64 bit server the limit is twice that width. 191 | 192 | Some fields are thus dynamically capped to the server architecture; in practice you have to assume they are 64 bit integers. These are in the schema, with an example: 193 | 194 | ```json 195 | "max_deliver": { 196 | "description": "The number of times a message will be redelivered to consumers if not acknowledged in time", 197 | "$comment": "integer with a dynamic bit size depending on the platform the cluster runs on, can be up to 64bit", 198 | "type": "integer", 199 | "maximum": 9223372036854775807, 200 | "minimum": -9223372036854775807 201 | } 202 | ``` 203 | 204 | Note that schema documents may list a different minimum than the natural minimum for the underlying type. 205 | 206 | ### 32bit Integers 207 | 208 | Similar to the 64 bit integers, we also have a 32 bit one; care should be taken not to overflow this number when sending data to the server. Here's an example: 209 | 210 | ```json 211 | "max_msg_size": { 212 | "description": "The largest message that will be accepted by the Stream. -1 for unlimited.", 213 | "minimum": -1, 214 | "default": -1, 215 | "$comment": "signed 32 bit integer", 216 | "type": "integer", 217 | "maximum": 2147483647 218 | } 219 | ``` 220 | 221 | Note that while this is a signed integer, schema documents may list a different minimum than the minimum for the type. Here the minimum is `-1`, with `-1` meaning unlimited.
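Clients can check such bounds before sending a configuration to the server rather than waiting for a rejection. A minimal sketch for the `max_msg_size` field above; the helper is illustrative, not part of any client library:

```go
package main

import "fmt"

// maxMsgSizeLimit is the signed 32 bit maximum from the schema.
const maxMsgSizeLimit = 2147483647

// checkMaxMsgSize validates a candidate max_msg_size value against the
// schema range, where -1 means unlimited.
func checkMaxMsgSize(v int64) error {
	if v < -1 || v > maxMsgSizeLimit {
		return fmt.Errorf("max_msg_size %d outside schema range [-1, %d]", v, maxMsgSizeLimit)
	}
	return nil
}

func main() {
	fmt.Println(checkMaxMsgSize(-1))      // accepted: -1 means unlimited
	fmt.Println(checkMaxMsgSize(1 << 40)) // rejected: overflows 32 bit
}
```

Performing the check client-side with a wider integer type (here `int64`) avoids accidental wrap-around before the comparison even happens.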
222 | 223 | ### Time Durations 224 | 225 | Some fields, like the maximum age of a message, are expressed as durations like `1 day`. 226 | 227 | When sending this to the server you have to do it as nanoseconds. 228 | 229 | Here's a helper to turn some Go durations into nanoseconds and confirm your calculations - [play tool](https://play.golang.org/p/iTkk74ZtiDT). 230 | Change the `300h` into your time; this tool supports `1ms`, `1m`, `1h` and will help you turn those times into nanoseconds. 231 | 232 | ### Time stamps 233 | 234 | Specific time stamps should usually be expressed in UTC, and when sending times to the API these should be in UTC also, 235 | but the server is flexible in handling times with zone offsets: it may at times give you back times with zones and will also accept them. 236 | 237 | These times are in JSON as quoted strings in RFC 3339 format, with sub-second precision; here are some examples: `2021-07-22T15:42:12.580770412Z`, `2021-07-22T23:48:48.27104904+08:00`. 238 | 239 | ## Error Handling 240 | 241 | ### JSON Based 242 | 243 | A JetStream API error looks like this: 244 | 245 | ```json 246 | { 247 | "type": "io.nats.jetstream.api.v1.consumer_info_response", 248 | "error": { 249 | "code": 404, 250 | "err_code": 10059, 251 | "description": "stream not found" 252 | } 253 | } 254 | ``` 255 | 256 | The `error` key will only be present for error conditions; the absence of an `error` key usually indicates success. 257 | 258 | Here we tried to get the status of a Consumer using a `io.nats.jetstream.api.v1.consumer_info_request` type message and 259 | we got a `404`. As with HTTP, `404` means something was not found; it could be the Stream or the Consumer. 260 | 261 | To avoid parsing - and treating as API - the error description, we have an additional code `10059`.
This is a unique NATS 262 | error: 263 | 264 | ```nohighlight 265 | $ nats error show 10059 266 | NATS Error Code: 10059 267 | 268 | Description: stream not found 269 | HTTP Code: 404 270 | Go Index Constant: JSStreamNotFoundErr 271 | 272 | No further information available 273 | ``` 274 | 275 | Looking at the error code `10059` we can tell that in this case the Stream was not found, vs `10014` if the Consumer was 276 | not found. 277 | 278 | The `nats` CLI has tools to view, search, list and more, covering all the error codes JetStream produces. These codes are static and 279 | will not change; however, more might be added in time. For example, today you might get a generic code indicating invalid 280 | configuration, but in future you might get a specific code indicating exactly what about your configuration was wrong. 281 | 282 | More details about the Error Codes system can be found in [ADR-7](ADR-7.md). 283 | 284 | ### Text Based 285 | 286 | Some APIs will respond with a text based error message; these will be in the form `-ERR <error>`. They are now very uncommon 287 | in the API and will likely be removed entirely in time. 288 | --------------------------------------------------------------------------------