├── .gitignore
├── 0012-BIRS-distributed-scheduler.md
├── 0013-RS-pubsub-subscription-streaming.md
├── 20221025-P-proposal-process.md
├── 20221121-R-pluggable-components-injector.md
├── 20221130-I-enhance-dapr-run-multiple-apps.md
├── 20230327-RCBS-Crypto-building-block.md
├── 20230406-B-external-service-invocation.md
├── 20230511-BCIRS-error-handling-codes.md
├── 20230627-P-proposal-sdk-approval.md
├── 20230714-S-sdk-resiliency.md
├── 20230918-S-unified-api-token-env-variable.md
├── 20231024-CIR-trust-distribution.md
├── 20240508-S-sidecar-endpoint-tls.md
├── 20240517-R-http-metrics-path-matching.md
├── 20240618-RCBS-Conversation-building-block.md
├── 20240917-BR-resiliency-error-code-retries.md
├── LICENSE
├── README.md
├── guides
│   └── api-design.md
├── resources
│   ├── 0004-BIRS-distributed-scheduler
│   │   ├── bigPicture.png
│   │   ├── pluggableSchedulerService.png
│   │   ├── publicDaprAPI.png
│   │   ├── sidecarToSchedulerComm.png
│   │   └── watchJobsFlow.png
│   ├── 20221130-I-enhance-dapr-run-multiple-apps
│   │   └── interaction-flow-1.png
│   ├── 20230327-RCBS-Crypto-building-block
│   │   └── data-flow.png
│   └── README.md
└── templates
    ├── lifecycle.md
    └── proposal.md

/.gitignore:
--------------------------------------------------------------------------------
1 | /dist
2 | .idea
3 | **/.DS_Store
4 | 
5 | github.com/
6 | 
7 | .vscode
8 | 
9 | # Visual Studio 2015/2017/2019 cache/options directory
10 | .vs/
11 | /vendor
12 | **/*.log
13 | **/.project
14 | **/.factorypath
15 | google
16 | 
17 | test_report*
18 | 
19 | # Go Workspaces (introduced in Go 1.18+)
20 | go.work
--------------------------------------------------------------------------------
/0012-BIRS-distributed-scheduler.md:
--------------------------------------------------------------------------------
1 | # Distributed Scheduler Building Block and Service
2 | 
3 | * Author(s):
4 |   * Cassie Coyle (@cicoyle)
5 |   * Yaron Schneider (@yaron2)
6 |   * Artur Souza (@artursouza)
7 | * State: Ready for Review
8 | * Updated: 2024-05-28
9 | 
10 | ## Overview
11 | 
12 | This design proposes 2 additions:
13 | - A Distributed Scheduler API Building Block
14 | - A Distributed Scheduler Control Plane Service
15 | 
16 | ## Description
17 | 
18 | A distributed scheduler is a system that manages the scheduling and orchestration of jobs across a distributed computing environment at specified times or intervals.
19 | 
20 | ## Motivation
21 | 
22 | Dapr users have a need for a distributed scheduler. The idea is to have an *orchestrator* for scheduling jobs in the future, either at a specific time or at a specific interval.
23 | 
24 | Examples include:
25 | - Scalable actor reminders
26 | - Scheduling any Dapr API to run at specific times or intervals. For example: sending Pub/Sub messages, invoking service methods, triggering input bindings, or saving state to a state store.
27 | 
28 | ## Goals
29 | 
30 | Implement a change into `dapr/dapr` that facilitates a seamless experience allowing for the scheduling of jobs across API building blocks using a new scheduler API building block and control plane service. The Scheduler Building Block is a job orchestrator, not an executor. The design guarantees *at least once* job execution with a bias towards durability and horizontal scaling over precision. This means we **can** guarantee that a job will never be invoked *before* the schedule is due, but we **cannot** guarantee a ceiling time on when the job is invoked *after* the due time is reached.
31 | 
32 | ## Non-Goals
33 | 
34 | - Retry logic
35 |   - From the Scheduler Service to the Sidecar.
A new 'sidecar' target will be added to available targets such that users can configure the Dapr Resiliency Policies.
36 | - Deep observability into jobs in the control plane
37 |   - Things beyond basic ListJobs. This might entail things like the history of prior triggered jobs or future jobs.
38 | - Applications to do REST on Jobs in other namespaces. Currently, jobs will be namespaced to the app/sidecar namespace.
39 | 
40 | ## Current Shortfalls
41 | 
42 | The **Workflows** building block is built on top of Actor Reminders, which have scale limitation issues today. The goal is to improve the performance and scale of Actor Reminders by using the distributed scheduler.
43 | 
44 | Currently, Dapr users are able to use the **Publish and Subscribe** building block, but are unable to have delayed PubSub scheduling. This scheduler service enables users to publish a message at a specific time in the future, for example a week from today or at a specific UTC date/time.
45 | 
46 | For **Service Invocation**, this building block could also benefit from a scheduler in that it would enable the scheduling of method calls between applications.
47 | 
48 | As of now, Dapr does have an **input cron binding** component, which allows users to schedule tasks. This requires a component YAML file, where users listen on an endpoint that is invoked on the configured schedule. This is limited to being an input binding only. The Scheduler Service will enable the scheduling of jobs to scale across multiple replicas, while guaranteeing that a job will only be triggered by 1 Scheduler Service instance.
49 | 
50 | *Note:* Performance is the primary focus while implementing this feature given the current shortfalls.
51 | 
52 | ## Big Picture Idea
53 | 
54 | ![Big Picture](./resources/0004-BIRS-distributed-scheduler/bigPicture.png)
55 | 
56 | If a user would like to store their user-associated data in a specific state store of their choosing, then they can provision a state store using the Dapr State Management Building Block and set `jobStateStore` as `true` in the state store component’s metadata section. Having the `jobStateStore` set to `true` means that their user-associated data will be stored in the state store of their choosing, but their job details will still be stored in the embedded etcd. If the `jobStateStore` is not configured, then the embedded etcd will be used to store both the job details and the user-associated data.
57 | 
58 | *Note:* The Scheduler functionality is usable by both Standalone (Self-Hosted) and Kubernetes modes.
59 | 
60 | ## Implementation Details
61 | 
62 | ### Building Block
63 | 
64 | #### Scenarios
65 | 
66 | ##### Example Usage
67 | 
68 | Users will have a job they would like scheduled. For example, an application performs a daily backup of their database. This backup task should run every day at a specific time to ensure data integrity. The user makes a call to schedule their job
69 | using the new Dapr Scheduler Building Block.
70 | 
71 | Example JSON (shown below) that you can use to schedule a job by making a request to `http://localhost:{daprPort}/v1.0/schedule/jobs/prd-db-backup`. This request schedules a job named `prd-db-backup` to run daily for the purpose of performing a database backup. The `@daily` schedule specification indicates that the job will run once a day, specifically at midnight (for more details, refer to the Schedule table below).
72 | 
73 | Note: This is an example to illustrate intent. The fields are purposeful for this example, and data can take any form for a job.
74 | ```json
75 | {
76 |   "schedule": "@daily",
77 |   "data": {
78 |     "task": "db-backup",
79 |     "metadata": {
80 |       "db_name": "my-prod-db",
81 |       "backup_location": "/backup-dir"
82 |     }
83 |   }
84 | }
85 | ```
86 | 
87 | Potential `dapr/go-sdk` example code:
88 | ```go
89 | import (
90 | 	... // "context", "encoding/json", "fmt", etc.
91 | 	schedulerapi "github.com/dapr/dapr/pkg/proto/scheduler/v1"
92 | 	"google.golang.org/protobuf/types/known/anypb"
93 | )
94 | ...
95 | 
96 | type Metadata struct {
97 | 	DBName         string `json:"db_name"`
98 | 	BackupLocation string `json:"backup_location"`
99 | }
100 | 
101 | type DBBackup struct {
102 | 	Task     string   `json:"task"`
103 | 	Metadata Metadata `json:"metadata"`
104 | }
105 | 
106 | func main() {
107 | 	...
108 | 
109 | 	// Define a job to be scheduled. The job's data payload is JSON-encoded bytes.
110 | 	payload, err := json.Marshal(&DBBackup{
111 | 		Task: "db-backup",
112 | 		Metadata: Metadata{
113 | 			DBName:         "my-prod-db",
114 | 			BackupLocation: "/backup-dir",
115 | 		},
116 | 	})
117 | 	if err != nil {
118 | 		fmt.Printf("Error marshaling job data: %v\n", err)
119 | 	}
120 | 	job := &schedulerapi.Job{
121 | 		Name: "prd-db-backup", Schedule: "@daily",
122 | 		Data: &anypb.Any{Value: payload},
123 | 	}
124 | 	// Schedule a job
125 | 	scheduleJobRequest := &schedulerapi.ScheduleJobRequest{
126 | 		Job: job,
127 | 	}
128 | 
129 | 	err = client.ScheduleJobAlpha1(context.Background(), scheduleJobRequest)
130 | 	if err != nil {
131 | 		fmt.Printf("Error scheduling job: %v\n", err)
132 | 	}
133 | 
134 | 	// Get a job by name
135 | 	getJobRequest := &schedulerapi.GetJobRequest{
136 | 		Name: "prd-db-backup",
137 | 	}
138 | 
139 | 	response, err := client.GetJobAlpha1(context.Background(), getJobRequest)
140 | 	if err != nil {
141 | 		fmt.Printf("Error getting job: %v\n", err)
142 | 	} else {
143 | 		job := response.Job
144 | 		fmt.Printf("Got job: %v\n", job)
145 | 	}
146 | 
147 | 	// List all jobs by app_id
148 | 	listJobsRequest := &schedulerapi.ListJobsRequest{
149 | 		AppID: "your-app-id",
150 | 	}
151 | 
152 | 	// List to be added after 1.14 release
153 | 	listResponse, err := client.ListJobsAlpha1(context.Background(), listJobsRequest)
154 | 	if err != nil {
155 | 		fmt.Printf("Error listing jobs: %v\n", err)
156 | 	} else {
157 | 		jobs := listResponse.Jobs
158 | 		fmt.Printf("List of jobs: %v\n", jobs)
159 | 	}
160 | 
161 | 	// Delete a job by name
162 | 	deleteJobRequest := &schedulerapi.DeleteJobRequest{
163 | 		Name: "prd-db-backup",
164 | 	}
165 | 
166 | 	err = client.DeleteJobAlpha1(context.Background(), deleteJobRequest)
167 | 	if err != nil {
168 | 		fmt.Printf("Error deleting job: %v\n", err)
169 | 	}
170 | 	...
171 | }
172 | ```
173 | 
174 | ##### Actor Reminders
175 | 
176 | The Scheduler Service will be deployed by default. However, for users to use the Scheduler Service for actor reminders, they will need to explicitly opt in via a preview feature.
177 | 
178 | The interval functionality of the Actor Reminder is similar to the job schedule. With Actor Reminders, a user can specify:
179 | ```json
180 | {
181 |   "dueTime": "10s",
182 |   "period": "R4/PT3S",
183 |   "ttl": "10s"
184 | }
185 | ```
186 | 
187 | Similar logic can be applied to a job in the following manner:
188 | 
189 | - `dueTime` => The time after which the job is invoked.
190 | - `period` => baked into the job parameters, as shown in the example below, where the job will run 4 times (`repeats`) and auto-deletes after 10s (`ttl`)
191 | - `repeats` => the job will run up to the number of `repeats` specified; if unspecified, it runs based on the `schedule` provided until it is deleted via a `ttl` or a user-requested deletion via the APIs
192 | 
193 | ```json
194 | {
195 |   "schedule": "@every 10s",
196 |   "dueTime": "10s",
197 |   "repeats": 4,
198 |   "ttl": "10s"
199 | }
200 | ```
201 | 
202 | The `dueTime` for jobs will follow the same formats as Actor Reminders. Supported formats:
203 | ```
204 | RFC3339 date format, e.g. 2020-10-02T15:00:00Z
205 | time.Duration format, e.g. 2h30m
206 | ISO 8601 duration format, e.g. PT2H30M
207 | ```
208 | 
209 | The `ttl` for jobs will follow the same formats as Actor Reminders. Supported formats:
210 | ```
211 | RFC3339 date format, e.g. 2020-10-02T15:00:00Z
212 | time.Duration format, e.g. 2h30m
213 | ISO 8601 duration format, e.g. PT2H30M
214 | ```
215 | 
216 | ##### Schedule
217 | 
218 | We will be using [this library](https://github.com/diagridio/go-etcd-cron), and will support the following `schedule` format.
219 | 
220 | A cron expression represents a set of times using 6 space-separated fields.
221 | 
222 | Field name   | Mandatory? | Allowed values  | Allowed special characters
223 | ----------   | ---------- | --------------  | --------------------------
224 | Seconds      | Yes        | 0-59            | * / , -
225 | Minutes      | Yes        | 0-59            | * / , -
226 | Hours        | Yes        | 0-23            | * / , -
227 | Day of month | Yes        | 1-31            | * / , - ?
228 | Month        | Yes        | 1-12 or JAN-DEC | * / , -
229 | Day of week  | Yes        | 0-6 or SUN-SAT  | * / , - ?
230 | 
231 | A user may use one of several pre-defined schedules in place of a cron expression.
232 | 
233 | Entry                   | Description                                | Equivalent To
234 | -----                   | -----------                                | -------------
235 | @yearly (or @annually)  | Run once a year, midnight, Jan. 1st        | 0 0 0 1 1 *
236 | @monthly                | Run once a month, midnight, first of month | 0 0 0 1 * *
237 | @weekly                 | Run once a week, midnight on Sunday        | 0 0 0 * * 0
238 | @daily (or @midnight)   | Run once a day, midnight                   | 0 0 0 * * *
239 | @hourly                 | Run once an hour, beginning of hour        | 0 0 * * * *
240 | 
241 | Examples of how a user's `schedule` may look:
242 | ```
243 | "0 30 * * * *"
244 | "0 * * 1,15 * Sun"
245 | "@hourly"
246 | "@every 1h30m"
247 | "@daily"
248 | ```
249 | 
250 | #### APIs
251 | 
252 | *Note:* For cases where there are multiple instances of an application trying to write the same job name concurrently, we will follow the [last-write-wins concurrency pattern](https://docs.dapr.io/developing-applications/building-blocks/state-management/howto-stateful-service/#first-write-wins-and-last-write-wins), as used in our state-management Building Block and Actor Reminders.
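To make that behavior concrete, here is a minimal hedged sketch reusing the SDK-style client and `schedulerapi` types from the earlier example (the client surface and variable names are illustrative assumptions, not a finalized SDK):

```go
// Two instances of the same app write the job name "prd-db-backup".
// Whichever ScheduleJobAlpha1 call lands last defines the stored schedule.
err = client.ScheduleJobAlpha1(ctx, &schedulerapi.ScheduleJobRequest{
	Job: &schedulerapi.Job{Name: "prd-db-backup", Schedule: "@daily"},
})

// A later write from another instance replaces the earlier definition:
// the job now triggers every 12 hours instead of daily.
err = client.ScheduleJobAlpha1(ctx, &schedulerapi.ScheduleJobRequest{
	Job: &schedulerapi.Job{Name: "prd-db-backup", Schedule: "@every 12h"},
})
```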
253 | 254 | ##### HTTP 255 | 256 | - Create a scheduled job 257 | - POST 258 | - http://localhost:{daprPort}/v1.0-alpha1/schedule/jobs/{name} 259 | 260 | - Delete a specific job by name 261 | - DELETE 262 | - http://localhost:{daprPort}/v1.0-alpha1/schedule/jobs/{name} 263 | 264 | - Get a specific job by name 265 | - GET 266 | - http://localhost:{daprPort}/v1.0-alpha1/schedule/jobs/{name} 267 | 268 | - List all jobs for an application 269 | - GET 270 | - http://localhost:{daprPort}/v1.0-alpha1/schedule/jobs?appId={app_id} 271 | 272 | ##### gRPC 273 | 274 | ![Public Dapr APIs (User Facing)](./resources/0004-BIRS-distributed-scheduler/publicDaprAPI.png) 275 | 276 | ###### User-Facing APIs 277 | 278 | ```proto 279 | service Dapr { 280 | … 281 | // Create and schedule a job 282 | rpc ScheduleJobAlpha1(ScheduleJobRequest) returns (google.protobuf.Empty) {} 283 | 284 | // Get a scheduled job 285 | rpc GetJobAlpha1(GetJobRequest) returns (GetJobResponse) {} 286 | 287 | // Delete a job 288 | rpc DeleteJobAlpha1(DeleteJobRequest) returns (google.protobuf.Empty) {} 289 | 290 | // List all jobs by app 291 | rpc ListJobsAlpha1(ListJobsRequest) returns (ListJobsResponse) {} 292 | } 293 | 294 | 295 | // Job is the definition of a job. 296 | message Job { 297 | // The unique name for the job. 298 | string name = 1; 299 | 300 | // The schedule for the job. 301 | optional string schedule = 2; 302 | 303 | // Optional: jobs with fixed repeat counts (accounting for Actor Reminders). 304 | optional uint32 repeats = 3; 305 | 306 | // Optional: sets time at which or time interval before the callback is invoked for the first time. 307 | optional string due_time = 4; 308 | 309 | // Optional: Time To Live to allow for auto deletes (accounting for Actor Reminders). 310 | optional string ttl = 5; 311 | 312 | // Job data. 313 | google.protobuf.Any data = 6; 314 | } 315 | 316 | // ScheduleJobRequest is the message to create/schedule the job. 317 | message ScheduleJobRequest { 318 | // The job details. 319 | Job job = 1; 320 | } 321 | 322 | // GetJobRequest is the message to retrieve a job. 323 | message GetJobRequest { 324 | // The name of the job. 325 | string name = 1; 326 | } 327 | 328 | // GetJobResponse is the message's response for a job retrieved. 329 | message GetJobResponse { 330 | // The job details. 331 | Job job = 1; 332 | } 333 | 334 | // DeleteJobRequest is the message to delete the job by name. 335 | message DeleteJobRequest { 336 | // The name of the job. 337 | string name = 1; 338 | } 339 | 340 | // ListJobsRequest is the message to list jobs by app_id. 341 | message ListJobsRequest { 342 | // The id of the application (app_id) for which to list jobs. 343 | string app_id = 1; 344 | } 345 | 346 | // ListJobsResponse is the response message to convey the list of jobs. 347 | message ListJobsResponse { 348 | // List of jobs that match the request criteria. 349 | repeated Job jobs = 1; 350 | } 351 | ``` 352 | 353 | ###### Daprd Sidecar to Scheduler Service APIs 354 | 355 | For the daprd sidecar to Scheduler Service communication, 356 | ![Scheduler APIs (SideCar Facing)](./resources/0004-BIRS-distributed-scheduler/sidecarToSchedulerComm.png) 357 | 358 | We will use the same exact protos from the Public Dapr API, but inside a **new**: `dapr/proto/scheduler/scheduler.proto`. 359 | The Schedule/Get/Delete job(s) will be performed via a unary call to the Scheduler Service. 
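As an illustration of that unary path, a hedged sketch of how the sidecar might wrap a user's request with its identity metadata (the `schedulerv1` alias, variable names, and connection setup are assumptions; the messages are those defined in the scheduler.proto in the next section):

```go
// Hedged sketch: daprd forwards a user's job to the Scheduler Service.
// JobMetadata lets the Scheduler namespace the job and route its trigger
// back to any instance of the same app ID.
schedule := "@daily"
_, err := schedulerClient.ScheduleJob(ctx, &schedulerv1.ScheduleJobRequest{
	Name: "prd-db-backup",
	Job: &schedulerv1.Job{
		Schedule: &schedule, // optional proto fields are pointers in generated Go
		Data:     &anypb.Any{Value: []byte(`{"task": "db-backup"}`)},
	},
	Metadata: &schedulerv1.JobMetadata{
		AppId:     "my-app",
		Namespace: "default",
		Target: &schedulerv1.JobTargetMetadata{
			Type: &schedulerv1.JobTargetMetadata_Job{Job: &schedulerv1.TargetJob{}},
		},
	},
})
```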
360 | 361 | There is a bidirectional streaming connection between the daprd sidecar and the Scheduler Service to allow for the acknowledgment of successfully triggered jobs. 362 | 363 | ###### Scheduler Service APIs 364 | 365 | In the **new** `dapr/proto/scheduler/scheduler.proto`, the daprd sidecar upon startup will establish a streaming connection with the Scheduler Service such that at the trigger time for a job the Scheduler Service will send that job to the daprd sidecar which is watching for jobs. Then the daprd sidecar will send the job to the app sending the `WatchJobsResponse` back to the Scheduler. 366 | 367 | ![WatchJobsFLow](./resources/0004-BIRS-distributed-scheduler/watchJobsFlow.png) 368 | 369 | ```proto 370 | service Scheduler { 371 | // ScheduleJob is used by the daprd sidecar to schedule a job. 372 | rpc ScheduleJob(ScheduleJobRequest) returns (ScheduleJobResponse) {} 373 | 374 | // Get a job 375 | rpc GetJob(GetJobRequest) returns (GetJobResponse) {} 376 | 377 | // DeleteJob is used by the daprd sidecar to delete a job. 378 | rpc DeleteJob(DeleteJobRequest) returns (DeleteJobResponse) {} 379 | 380 | // WatchJobs is used by the daprd sidecar to connect to the Scheduler 381 | // service to watch for jobs triggering back. 382 | rpc WatchJobs(stream WatchJobsRequest) returns (stream WatchJobsResponse) {} 383 | } 384 | 385 | message Job { 386 | // The schedule for the job. 387 | optional string schedule = 1; 388 | 389 | // Optional: jobs with fixed repeat counts (accounting for Actor Reminders). 390 | optional uint32 repeats = 2; 391 | 392 | // Optional: sets time at which or time interval before the callback is invoked for the first time. 393 | optional string due_time = 3; 394 | 395 | // Optional: Time To Live to allow for auto deletes (accounting for Actor Reminders). 396 | optional string ttl = 4; 397 | 398 | // Job data. 399 | google.protobuf.Any data = 5; 400 | } 401 | 402 | // TargetJob is the message used by the daprd sidecar to schedule a job 403 | // from an App. 404 | message TargetJob {} 405 | 406 | // TargetActorReminder is the message used by the daprd sidecar to 407 | // schedule a job from an Actor Reminder. 408 | message TargetActorReminder { 409 | // id is the actor ID. 410 | string id = 1; 411 | 412 | // type is the actor type. 413 | string type = 2; 414 | } 415 | 416 | // JobTargetMetadata holds the typed metadata associated with the job for 417 | // different origins. 418 | message JobTargetMetadata { 419 | oneof type { 420 | TargetJob job = 1; 421 | TargetActorReminder actor = 2; 422 | } 423 | } 424 | 425 | // JobMetadata is the message used by the daprd sidecar to schedule/get/delete a 426 | // job. 427 | message JobMetadata { 428 | // app_id is the App ID of the requester. 429 | string app_id = 1; 430 | 431 | // namespace is the namespace of the requester. 432 | string namespace = 2; 433 | 434 | // target is the type of the job. 435 | JobTargetMetadata target = 3; 436 | } 437 | 438 | // WatchJobsRequest is the message used by the daprd sidecar to connect to the 439 | // Scheduler and send Job process results. 440 | message WatchJobsRequest { 441 | oneof watch_job_request_type { 442 | WatchJobsRequestInitial initial = 1; 443 | WatchJobsRequestResult result = 2; 444 | } 445 | } 446 | 447 | // WatchJobsRequestInitial is the initial request to start watching for jobs. 448 | message WatchJobsRequestInitial { 449 | // app_id is the App ID of the requester. 450 | string app_id = 1; 451 | 452 | // namespace is the namespace of the requester. 
453 | string namespace = 2; 454 | 455 | // actor_types is the optional list of actor types to watch for. 456 | repeated string actor_types = 3; 457 | } 458 | 459 | // WatchJobsRequestResult is the result of a job execution to allow the job to 460 | // be marked as processed. 461 | message WatchJobsRequestResult { 462 | // uuid is the uuid of the job that has finished processing. 463 | uint64 uuid = 1; 464 | } 465 | 466 | // WatchJobsResponse is the response message to convey the details of a job. 467 | message WatchJobsResponse { 468 | // name is the name of the job which was triggered. 469 | string name = 1; 470 | 471 | // uuid is the uuid of the job trigger event which should be sent back from 472 | // the client to be marked as processed. 473 | uint64 uuid = 2; 474 | 475 | // Job data. 476 | google.protobuf.Any data = 3; 477 | 478 | // The metadata associated with the job. 479 | JobMetadata metadata = 4; 480 | } 481 | 482 | message ScheduleJobRequest { 483 | // name is the name of the job to create. 484 | string name = 1; 485 | 486 | // The job to be scheduled. 487 | Job job = 2; 488 | 489 | // The metadata associated with the job. 490 | JobMetadata metadata = 3; 491 | } 492 | 493 | message ScheduleJobResponse { 494 | // Empty as of now 495 | } 496 | 497 | // GetJobRequest is the message used by the daprd sidecar to delete or get a job. 498 | message GetJobRequest { 499 | // name is the name of the job. 500 | string name = 1; 501 | 502 | // The metadata associated with the job. 503 | JobMetadata metadata = 2; 504 | } 505 | 506 | // GetJobResponse is the response message to convey the details of a job. 507 | message GetJobResponse { 508 | // The job to be scheduled. 509 | Job job = 1; 510 | } 511 | 512 | // DeleteJobRequest is the message used by the daprd sidecar to delete or get a job. 513 | message DeleteJobRequest { 514 | string name = 1; 515 | 516 | // The metadata associated with the job. 517 | JobMetadata metadata = 2; 518 | } 519 | 520 | message DeleteJobResponse { 521 | // Empty as of now 522 | } 523 | ``` 524 | 525 | To allow for the triggered job to be sent back to any instance of the same app id that scheduled the job, we will add: 526 | ```proto 527 | // AppCallback allows user application to interact with Dapr runtime. 528 | // User application needs to implement AppCallback service if it needs to 529 | // receive message from dapr runtime. 530 | service AppCallback { 531 | ... 532 | // Sends job back to the app's endpoint at trigger time. 533 | rpc OnJobEvent (JobEventRequest) returns (JobEventResponse); 534 | } 535 | 536 | message JobEventRequest { 537 | // Job name. 538 | string name = 1; 539 | 540 | // Job data to be sent back to app. 541 | google.protobuf.Any data = 2; 542 | 543 | // Required. method is a method name which will be invoked by caller. 544 | string method = 3; 545 | 546 | // The type of data content. 547 | // 548 | // This field is required if data delivers http request body 549 | // Otherwise, this is optional. 550 | string content_type = 4; 551 | 552 | // HTTP specific fields if request conveys http-compatible request. 553 | // 554 | // This field is required for http-compatible request. Otherwise, 555 | // this field is optional. 556 | common.v1.HTTPExtension http_extension = 5; 557 | } 558 | 559 | // JobEventResponse is the response from the app when a job is triggered. 
560 | message JobEventResponse {}
561 | ```
562 | 
563 | ### Scheduler Service
564 | 
565 | ![Pluggable Scheduler Service](./resources/0004-BIRS-distributed-scheduler/pluggableSchedulerService.png)
566 | 
567 | A new `Scheduler Service` is created in the control plane. This Scheduler Service will include an embedded etcd instance (persisted) as well as a Scheduler Dapr API, which will live in `dapr/proto/scheduler/scheduler.proto`. The Scheduler Service is pluggable and allows for different implementations as needed. It is installed by default into the local development environment on `dapr init`, similar to other control plane services in Dapr. This is an optional service and runs in a local container.
568 | 
569 | To guarantee we don't have several Scheduler Service instances firing off the same job, we will have **virtual partitioning** (in-memory) such that each Scheduler Service instance owns a subset of all the jobs that exist.
570 | 
571 | ### CLI
572 | 
573 | `dapr job schedule --name=<name> --schedule="@hourly" --data="<data>"`
574 | 
575 | ### Implications
576 | 
577 | - The Scheduler Building Block and Service will result in the ***deprecation*** of `actor reminders` and the `bindings.cron` component.
578 | 
579 | ## Expectations and alternatives
580 | 
581 | * What is in scope for this proposal?
582 |   * To start, this will implement generic scheduler logic, then will be expanded to enable:
583 |     * Delayed PubSub
584 |     * Scheduled Service Invocation
585 |     * Actor Reminders
586 | * What alternatives have been considered, and why do they not solve the problem?
587 |   * Placement with Actor reminders is *very* Actor specific. The goal is to have a new scheduler in Dapr that is reusable by several building blocks.
588 | * What advantages does this proposal have?
589 |   * This design has the advantage of enabling the Scheduler Service to be implemented in different ways.
590 |   * This design also enables flexibility in where the data is stored. Whether that is in etcd in full, or partially with a reference to a state store (component) of the user's choice.
591 | 
592 | ### Acceptance Criteria
593 | 
594 | * How will success be measured?
595 |   * POCs will be done to guarantee the optimal performance solution for the Scheduler Service, testing with 3 & 5 Scheduler Service instances
596 |     * minimum RPS for registering reminders
597 |     * minimum RPS for triggers to app
598 |     * maximum number of reminders
599 |     * have a backup and recovery scenario for when the Scheduler cluster permanently fails
600 | 
601 | ## Completion Checklist
602 | 
603 | What changes or actions are required to make this proposal complete? Some examples:
604 | 
605 | * Scheduler Building Block API code
606 | * Scheduler Service code
607 | * Tests added (e2e, unit)
608 | * SDK changes
609 | * Documentation
610 | 
611 | 
--------------------------------------------------------------------------------
/0013-RS-pubsub-subscription-streaming.md:
--------------------------------------------------------------------------------
1 | # PubSub Subscription Streaming
2 | 
3 | * Author(s): @joshvanl
4 | * State: Ready for Implementation
5 | * Updated: 2024-03-05
6 | 
7 | ## Overview
8 | 
9 | This is a design proposal to implement a new Dapr runtime gRPC and HTTP API for subscription streaming.
11 | Applications will be able to dynamically subscribe and unsubscribe to topics, and receive messages without opening a port to receive incoming traffic from Dapr. 12 | 13 | ## Background 14 | 15 | Dapr supports applications subscribing to PubSub topic events. 16 | These subscriptions can be configured either: 17 | - `programmatically` by returning the subscription config on the app channel server on app health ready, or 18 | - `declaratively` via Subscription yaml manifests in Self-Hosted or Kubernetes mode. 19 | 20 | Today, it is not possible to dynamically update the subscription list without restarting Daprd, though hot reloading for Subscription manifests is [planned](https://github.com/dapr/dapr/issues/7139). 21 | It is common for users to want to dynamically subscribe and unsubscribe to topics inside their applications based on runtime conditions. 22 | In the cases where Dapr is not running as a sidecar, users often do not want to open a public port or create a tunnel in order to receive PubSub messages from Dapr. 23 | 24 | A streaming Subscription API will allow applications to dynamically subscribe to PubSub topics and receive messages without opening a port to receive incoming traffic from Dapr. 25 | 26 | ## Expectations and alternatives 27 | 28 | This proposal outlines the gRPC & HTTP streaming API for subscribing to PubSub topics. 29 | This proposal does _not_ address any hot-reloading functionality to the existing programmatic or declarative subscription configuration. 30 | Using a gRPC streaming API is the most natural fit for this feature, as it allows for first class long-lived bi-directional connections to Dapr to receive messages. 31 | A supplementary WebSocket based HTTP API is useful for applications which do not have a gRPC client available or HTTP WebSockets are preferred. 32 | These messages are typed RPC giving the best UX in each SDK. 33 | Once implemented, this feature will need to be implemented in all Dapr SDKs. 34 | 35 | ## Solution 36 | 37 | ### gRPC 38 | 39 | Rough gRPC PoC implementation: https://github.com/dapr/dapr/commit/ed40c95d11b78ab9a36a4a8f755cf89336ae5a05 40 | 41 | The Dapr runtime gRPC server will implement the following new RPC and messages: 42 | 43 | ```proto 44 | service Dapr { 45 | // SubscribeTopicEventsAlpha1 subscribes to a PubSub topic and receives topic events 46 | // from it. 47 | rpc SubscribeTopicEventsAlpha1(stream SubscribeTopicEventsRequestAlpha1) returns (stream TopicEventRequestAlpha1) {} 48 | } 49 | 50 | // SubscribeTopicEventsRequest is a message containing the details for 51 | // subscribing to a topic via streaming. 52 | // The first message must always be the initial request. All subsequent 53 | // messages must be event responses. 54 | message SubscribeTopicEventsRequestAlpha1 { 55 | oneof subscribe_topic_events_request_type { 56 | SubscribeTopicEventsSubscribeRequestAlpha1 request = 1; 57 | SubscribeTopicEventsResponseAlpha1 event_response = 2; 58 | } 59 | } 60 | 61 | // SubscribeTopicEventsSubscribeRequest is the initial message containing the 62 | // details for subscribing to a topic via streaming. 63 | message SubscribeTopicEventsSubscribeRequestAlpha1 { 64 | // The name of the pubsub component 65 | string pubsub_name = 1 [json_name = "pubsubName"]; 66 | 67 | // The pubsub topic 68 | string topic = 2 [json_name = "topic"]; 69 | 70 | // The metadata passing to pub components 71 | // 72 | // metadata property: 73 | // - key : the key of the message. 
74 |   map<string, string> metadata = 3 [json_name = "metadata"];
75 | 
76 |   // dead_letter_topic is the topic to which messages that fail to be processed
77 |   // are sent.
78 |   optional string dead_letter_topic = 4 [json_name = "deadLetterTopic"];
79 | 
80 |   // max_in_flight_messages is the maximum number of in-flight messages that
81 |   // can be processed by the subscriber at any given time.
82 |   // Default is no limit.
83 |   optional int32 max_in_flight_messages = 5 [json_name = "maxInFlightMessages"];
84 | }
85 | 
86 | // SubscribeTopicEventsResponse is a message containing the result of a
87 | // subscription to a topic.
88 | message SubscribeTopicEventsResponseAlpha1 {
89 |   // id is the unique identifier for the subscription request.
90 |   string id = 1 [json_name = "id"];
91 | 
92 |   // status is the result of the subscription request.
93 |   TopicEventResponseAlpha1 status = 2 [json_name = "status"];
94 | }
95 | ```
96 | 
97 | When an application wishes to subscribe to a topic it will initiate a stream with `SubscribeTopicEventsRequest`, and `Send` the initial request `SubscribeTopicEventsSubscribeRequest` containing the options for the subscription.
98 | Daprd will then set up the machinery to add this gRPC stream to the set of subscribers.
99 | The request contains no route or path matching configuration as all events will be sent on this stream.
100 | Subscription gRPC streams are the highest priority when Daprd determines which subscriber a message should be sent to.
101 | Only a single PubSub/topic pair may be subscribed to at a time with this API.
102 | If the first message sent to the server is not the initial request, the RPC will return an error.
103 | If any subsequent messages are not `SubscribeTopicEventsResponse` messages, the RPC will return an error.
104 | 
105 | When a message is published to the topic, Daprd will send a `TopicEventRequest` message on the stream containing the message payload and metadata.
106 | After the application has processed the message, it will send to the server a `SubscribeTopicEventsResponse` containing the `id` of the message and the `status` of the message processing.
107 | Since multiple messages can be sent and processed in the application at the same time, the event `id` is used by the server to track the status of each individual event.
108 | An event topic response will follow the timeout resiliency as currently exists for subscriptions.
109 | 
110 | Client code:
111 | 
112 | ```go
113 | stream, _ := client.SubscribeTopicEventsAlpha1(ctx)
114 | stream.Send(&rtv1.SubscribeTopicEventsRequestAlpha1{
115 | 	SubscribeTopicEventsRequestType: &rtv1.SubscribeTopicEventsRequestAlpha1_Request{
116 | 		Request: &rtv1.SubscribeTopicEventsSubscribeRequestAlpha1{
117 | 			PubsubName: "mypub", Topic: "a",
118 | 		},
119 | 	},
120 | })
121 | 
122 | client.PublishEvent(ctx, &rtv1.PublishEventRequest{
123 | 	PubsubName: "mypub", Topic: "a",
124 | 	Data:            []byte(`{"status": "completed"}`),
125 | 	DataContentType: "application/json",
126 | })
127 | 
128 | event, _ := stream.Recv()
129 | stream.Send(&rtv1.SubscribeTopicEventsRequestAlpha1{
130 | 	SubscribeTopicEventsRequestType: &rtv1.SubscribeTopicEventsRequestAlpha1_EventResponse{
131 | 		EventResponse: &rtv1.SubscribeTopicEventsResponseAlpha1{
132 | 			Id:     event.Id,
133 | 			Status: &rtv1.TopicEventResponse{Status: rtv1.TopicEventResponse_SUCCESS},
134 | 		},
135 | 	},
136 | })
137 | 
138 | stream.CloseSend()
139 | ```
140 | 
141 | ### HTTP (WebSockets)
142 | 
143 | Along with a gRPC based streaming API, a WebSocket based HTTP equivalent API will be implemented.
144 | Much like the gRPC API, the HTTP based WebSocket API will follow an initial request-response handshake, followed by a stream of messages to the client with status responses by the client, indexed by the message ID.
145 | The same proto types as used in the gRPC API (but in JSON blobs) will be used for the HTTP API.
146 | The server WebSocket implementation will be based on the [gorilla/websocket](https://github.com/gorilla/websocket) package, as this seems well used, understood and maintained.
147 | 
148 | The HTTP streaming API will be available at the following endpoint.
149 | As the pubsub and topic information is in the request body, no request configuration is given in the URL.
150 | 
151 | ```
152 | GET: /v1.0-alpha1/subscribe
153 | ```
154 | 
155 | ```json
156 | INITIAL_REQUEST (to server) = {
157 |   "pubsubName": "mypub",
158 |   "topic": "a",
159 |   "metadata": {
160 |     "key": "value"
161 |   },
162 |   "deadLetterTopic": "dead-letter-topic",
163 |   "maxInFlightMessages": 10
164 | }
165 | 
166 | TOPIC_EVENT_REQUEST (to application) = {
167 |   "id": "123",
168 |   "source": "asource",
169 |   "type": "atype",
170 |   "spec_version": "1.0",
171 |   "data_content_type": "application/json",
172 |   "data": "abc",
173 |   "topic": "a",
174 |   "pubsub_name": "mypub",
175 |   "path": "/"
176 | }
177 | 
178 | TOPIC_EVENT_RESPONSE (to server) = {
179 |   "id": "123",
180 |   "status": {
181 |     "status": "SUCCESS"
182 |   }
183 | }
184 | ```
185 | 
186 | ## Completion Checklist
187 | 
188 | - [ ] gRPC server implementation in daprd
189 | - [ ] API documentation
190 | - [ ] SDK implementations
191 |   - [ ] .NET
192 |   - [ ] Java
193 |   - [ ] Go
194 |   - [ ] Python
195 |   - [ ] JavaScript
196 | 
--------------------------------------------------------------------------------
/20221025-P-proposal-process.md:
--------------------------------------------------------------------------------
1 | # Improving the Dapr proposal process
2 | 
3 | * Author(s): John Ewart (@johnewart), Mukundan Sundararajan (@mukundansundar)
4 | * State: Approved
5 | * Date: 10/25/2022
6 | 
7 | ## Overview
8 | 
9 | This proposal is to formalize the structure and lifecycle of proposals, with three primary goals: first, make it easier for contributors to both put forth proposals as well as review them; second, increase the clarity and focus of proposals themselves; and, third, provide guidance on what is expected for a well-defined feature to be "dev complete". In order to do this, we should:
10 | 
11 | 1. Create a template for new proposals.
12 | 2. Define the core requirements for a feature that is being proposed to be considered complete.
13 | 3. Implement a process for reviewing and accepting new proposals.
14 | 
15 | ## Background
16 | 
17 | ### Why define a proposal process and templatize it?
18 | 
19 | As a community project, Dapr relies on contributors to help advance the project; the goal of this proposal is to simplify the process of contributing, and evaluating, new ideas. We want to make it more inviting for community members to propose new ideas (or evaluate them) as well as ensure that the time being spent evaluating proposals or working on new features is well spent.
20 | 
21 | Adding clarity to the proposal process, as well as some amount of structure, will hopefully make it easier for contributors of all experience levels to both contribute and review new ideas. As a new contributor it can sometimes feel a bit daunting to propose a new idea -- not knowing quite where to start, whether or not what you are proposing has the right level of information, etc. Having structure makes it easier for a new contributor who wants to propose a new idea to know what is expected and feel confident that their proposal meets those expectations. In addition, for anyone putting forward a proposal, the structure proposed prompts thinking about how someone else would use the feature, how they might benefit from it, and what other ways the feature they are proposing might be solved using existing features (or other technology).
22 | 
23 | On the other side of the equation are the community members who are reviewing those proposals; it can be challenging to review something if you feel that information or context is lacking.
A consistent structure means that reviewers can know what to expect out of a proposal document and clearly ask for more information if some is missing. And, the suggested structure would make sure that reviewers have the right information needed to have a conversation about the proposal (as well as reduce the scope of the review).
24 | 
25 | As a community, we want to be welcoming of new people and also respectful of the time and energy that everyone devotes to make this project great. I believe that adding this small amount of structure to the proposal process will help not only make it easier to propose new ideas, but also ensure that everyone who is participating can make the best use of the time they have available to improve Dapr!
26 | 
27 | ### Why define minimum requirements for a feature to be complete?
28 | 
29 | As Dapr increases in scope and brings on more contributions, it is important that we define what we expect before a new feature is added to Dapr. In order to assure that all aspects of the feature have been completed for release, we need to provide clear guidance on what needs to be accomplished before it is accepted into Dapr. This will help us to qualify these features as complete for a particular release milestone and be confident in what is being released. For example, most features would require, at a minimum:
30 | 
31 | * Completion of the code
32 | * Maintainer signoff on the implementation by the feature freeze date
33 | * Code merged into the main branch well before the code freeze date (feature freeze date at the latest)
34 |   * During the time between feature freeze and code freeze, any P0 regressions/bugs related to this feature that are identified need to be fixed.
35 | * Adding / updating performance tests, e2e tests etc.
36 | * Documentation for the new feature has been committed to `dapr/docs`
37 | * Creating / updating quick starts, tutorials (if relevant)
38 | 
39 | In addition, some features would require changes to SDKs or have additional requirements in order to be considered fully complete, and so those requirements should also be tracked in order to ensure completion.
40 | 
41 | 
42 | ## Background vs Design
43 | 
44 | > "Your scientists were so preoccupied with whether they could, they didn’t stop to think if they should." - Dr. Ian Malcolm (Jurassic Park)
45 | 
46 | The intent of the proposal is to focus on three primary areas:
47 | 
48 | * _What_ the proposal is putting forth
49 | * _Why_ the proposal is required (what will it do for users?)
50 | * _How_ the proposal will work
51 | 
52 | 
53 | ### A bit about the _why?_
54 | 
55 | In order to be effective, a proposal must provide both the background on the idea: what it is and how it works (at a high level) as well as _why it should be introduced to Dapr_. The why part of this proposal is just as important (if not more so) than what is being proposed, as it lets the reviewers understand better what kinds of use cases the feature shall enable, and how it shall make Dapr better or improve the experience of Dapr users. By going through and clarifying _why_ something should be added to Dapr, it forces us as developers to think carefully about what we are taking on as a community and how it will impact others - the benefits and potential drawbacks that it might bring.
56 | 
57 | In addition, it must also convey what is in scope and what is out of scope (i.e. what things have been deliberately omitted) along with any alternatives that have been considered and why they were not a good fit.
58 | 59 | ### Design 60 | 61 | The second half of the process focuses on the implementation of the proposal - the goal of this part is to show the community not only how it will operate, but also provide information on how success shall be measured and also include a list of activities that must be completed in order for this proposal to be complete. 62 | 63 | ## Proposed templates for Design and Build Phases 64 | 65 | See the following file: [templates/proposal.md](templates/proposal.md) 66 | 67 | ## Related Items 68 | 69 | ### Related proposals 70 | 71 | N/A 72 | 73 | ### Related issues 74 | 75 | N/A 76 | 77 | ## Expectations and alternatives 78 | 79 | ### What is in scope for this proposal? 80 | 81 | This proposal covers the process for creating, storing, and reviewing proposals. The intention is to improve the process and increase clarity around proposals, their status, and their design. 82 | 83 | ### What is deliberately *not* in scope? 84 | 85 | The planning process for including proposals in a given release is not a part of this proposal, the assumption is that process will continue to operate as it currently does. 86 | 87 | ### What advantages / disadvantages does this proposal have? 88 | 89 | This proposal has the advantage of increasing clarity of proposals as well as implicitly creating a record of design decisions; however, it is a little more involved and structured than the previous process, which may be viewed as a disadvantage. The authors of this proposal believe that the advantages significantly outweigh any potential disadvantages, however. 90 | 91 | ## Implementation Details 92 | 93 | ### Completion Checklist 94 | 95 | - [ ] A new repository (`dapr/proposals`) is created as a copy of this repository 96 | - [ ] Any relevant documentation around submitting proposals or the development process point to this new repository and process 97 | - [ ] Migration of existing _in-flight_ proposals from GitHub issues to this repository 98 | - [ ] _(Optional)_ Migration of previous proposals to this repository 99 | -------------------------------------------------------------------------------- /20221121-R-pluggable-components-injector.md: -------------------------------------------------------------------------------- 1 | # Pluggable components injector 2 | 3 | - Author(s): Marcos Candeia (@mcandeia) 4 | - State: Ready for Implementation 5 | - Updated: 11/21/2022 6 | 7 | ## Overview 8 | 9 | Pluggable components are components that are not included as part of the runtime, as opposed to built-in ones that are included. The major difference between pluggable components and built-in components is the operational burden related to bootstrap/start the pluggable component process that are not necessary when using a built-in one since they run in the same process as Dapr runtime. This operational burden is present in many ways when using pluggable components and can lead to errors and hard debugging. In addition, there are certain configurations that are tied to the Dapr and how the runtime registers the pluggable component that is repetitive and can be better handled by Dapr instead of delegating this responsibility to the end-user. This proposal suggest the addition of a new mode of execution for selected pluggable components: injectable pluggable components. 
10 | 
11 | ## Background
12 | 
13 | #### Decrease the operational burden
14 | 
15 | Even considering the new pluggable components annotation from [#5402](https://github.com/dapr/dapr/issues/5402), setting up applications to properly work with pluggable components is still not an easy task due to the operational burden related to bootstrapping containers over and over again for each application that the user needs, especially if you consider that components are often not well [scoped](https://docs.dapr.io/operations/components/component-scopes/). Without scope, a component makes itself available for all applications within the same namespace, meaning that every deployment/pod should re-do the same manual job of mounting volumes, declaring environment variables and pinning container images.
16 | 
17 | So let's say you have an application named `my-app` and another one named `my-app-2`; your two deployments/pods will look like the following:
18 | 
19 | ```yaml
20 | apiVersion: apps/v1
21 | kind: Deployment
22 | metadata:
23 |   name: app
24 |   labels:
25 |     app: app
26 | spec:
27 |   replicas: 1
28 |   selector:
29 |     matchLabels:
30 |       app: app
31 |   template:
32 |     metadata:
33 |       labels:
34 |         app: app
35 |       annotations:
36 |         dapr.io/pluggable-components: "component"
37 |         dapr.io/app-id: "my-app"
38 |         dapr.io/enabled: "true"
39 |     spec:
40 |       volumes:
41 |         - name: my-component-required-volume
42 |           emptyDir: {}
43 |       containers:
44 |         - name: my-app
45 |           image: my-app-image:latest
46 |         ### This is the pluggable component container.
47 |         - name: component
48 |           image: component:v1.0.0
49 |           volumeMounts:
50 |             - name: my-component-required-volume
51 |               mountPath: "/my-data"
52 |           env:
53 |             - name: MY_ENV_VAR_NAME
54 |               value: MY_ENV_VAR_VALUE
55 | 
56 | ---
57 | apiVersion: apps/v1
58 | kind: Deployment
59 | metadata:
60 |   name: app-2
61 |   labels:
62 |     app: app-2
63 | spec:
64 |   replicas: 1
65 |   selector:
66 |     matchLabels:
67 |       app: app-2
68 |   template:
69 |     metadata:
70 |       labels:
71 |         app: app-2
72 |       annotations:
73 |         dapr.io/pluggable-components: "component"
74 |         dapr.io/app-id: "my-app-2"
75 |         dapr.io/enabled: "true"
76 |     spec:
77 |       volumes:
78 |         - name: my-component-required-volume
79 |           emptyDir: {}
80 |       containers:
81 |         - name: my-app-2
82 |           image: my-app-2-image:latest
83 |         ### This is the pluggable component container.
84 |         - name: component
85 |           image: component:v1.0.0
86 |           volumeMounts:
87 |             - name: my-component-required-volume
88 |               mountPath: "/my-data"
89 |           env:
90 |             - name: MY_ENV_VAR_NAME
91 |               value: MY_ENV_VAR_VALUE
92 | ```
93 | 
94 | Notice that everything related to the pluggable component container is repeated, and if you have a third application that doesn't require your pluggable component to work, you have to scope your component to be initialized with only these two declared deployments/pods.
95 | 
96 | ```yaml
97 | apiVersion: dapr.io/v1alpha1
98 | kind: Component
99 | metadata:
100 |   name: my-component
101 | spec:
102 |   type: state.my-component
103 |   version: v1
104 |   metadata: []
105 | scopes:
106 |   - "my-app"
107 |   - "my-app-2"
108 | ```
109 | 
110 | For each deployment that you add to your cluster that requires such a pluggable component, you must also add it to the scope list of the component spec, which ends up being error-prone and intrusive.
111 | 
112 | #### Component spec atomicity/self-contained
113 | 
114 | Interchangeable/swappable components are one of the top features that we provide: a user can, at runtime, swap out a component with the same interface for another.
Pluggable components made this behavior more difficult to maintain as it requires coordination: for a small time window, the user must provide a way for Dapr to access both components at the same time; otherwise it becomes very difficult to orchestrate that change manually.
115 | To exemplify, suppose that we want to replace the Redis PubSub with the Kafka PubSub, and they are pluggable components. This is not only a matter of replacing the component spec itself, but it will require orchestrating the related deployments; otherwise it would lead to having an application pointing to Kafka but with no Kafka pluggable component running, and vice-versa.
116 | 
117 | The following diagram illustrates how that orchestrated change must be applied:
118 | 
119 | image
120 | 
121 | > That can't be avoided in scenarios where Dapr is not present as an orchestrator, for instance, self-hosted mode, but there are platforms that support extensibility for orchestrating applications and their dependencies, like Kubernetes.
122 | 
123 | You can argue that Kubernetes solves this scenario by reconciling the cluster state until it succeeds, but still, it severely degrades the user experience by requiring additional knowledge to build applications with Dapr.
124 | 
125 | ## Related Items
126 | 
127 | ### Related proposals
128 | 
129 | [Pluggable components Annotations](https://github.com/dapr/dapr/issues/5402)
130 | 
131 | ### Related issues
132 | 
133 | N/A
134 | 
135 | ## Expectations and alternatives
136 | 
137 | ### What is in scope for this proposal?
138 | 
139 | This proposal aims to add a new execution mode for pluggable components, the dapr-injected pluggable components.
140 | 
141 | ### What is deliberately _not_ in scope?
142 | 
143 | This proposal does not aim to manage users' pluggable components code. The goal here is to provide a better UX when using pluggable components while decreasing the operational burden.
144 | 
145 | ## Implementation Details
146 | 
147 | ### Design
148 | 
149 | This proposal aims to add a new execution mode for pluggable components, the dapr-injected pluggable components, that makes the operational experience remarkably similar to that of built-in components. The operational burden is still present somewhere, but divided into small reusable pieces.
150 | 
151 | 
152 | 
153 | | Type              | Injected by Dapr                                                                          | Managed by User/Unmanaged                                                                                            |
154 | | ----------------- | ----------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------- |
155 | | Configuration     | Dapr injects env vars and mounts the shared volumes                                       | The user manually mounts and declares shared volumes                                                                 |
156 | | Container updates | Dapr automatically detects and applies, rolling out changes based on declared components  | Users must redeploy their applications with the new desired version                                                  |
157 | | Persona           | Cluster operator/End user                                                                 | End user                                                                                                             |
158 | | Scope             | Does not need to be scoped                                                                | If not scoped, all applications should have deployed the pluggable component, otherwise runtime errors might happen  |
159 | 
160 | 
161 | 
162 | #### Component spec annotations
163 | 
164 | The component spec is still the entry point for all component types, pluggable or not. Given that the pluggable components are a subset of all user-declared components, and can in fact be inferred from the declared components, we can leverage that property to extend our component spec by adding custom annotations that allow Dapr to inject the component container at the time the Injector is also injecting the Dapr sidecar container.
165 | 
166 | Example:
167 | 
168 | ```yaml
169 | apiVersion: dapr.io/v1alpha1
170 | kind: Component
171 | metadata:
172 |   name: my-component
173 |   annotations:
174 |     dapr.io/component-container-image: "component:v1.0.0"
175 | spec:
176 |   type: state.my-component
177 |   version: v1
178 |   metadata: []
179 | ```
180 | 
181 | Optionally you can mount volumes and add env variables into the containers by using the `dapr.io/component-container-volume-mounts(-rw)` and `dapr.io/component-container-env` annotations.
182 | 
183 | ```yaml
184 | apiVersion: dapr.io/v1alpha1
185 | kind: Component
186 | metadata:
187 |   name: my-component
188 |   annotations:
189 |     dapr.io/component-container-image: "component:v1.0.0"
190 |     dapr.io/component-container-volume-mounts: "volume-name:/volume-path,volume-name-2:/volume-path-2" # read-only, "$VOLUME_NAME:$VOLUME_PATH,$VOLUME_NAME_2:$VOLUME_PATH2"
191 |     dapr.io/component-container-volume-mounts-rw: "volume-name-rw:/volume-path-rw,volume-name-2-rw:/volume-path-2-rw" # read-write "$VOLUME_NAME:$VOLUME_PATH,$VOLUME_NAME_2:$VOLUME_PATH2"
192 |     dapr.io/component-container-env: "env-var=env-var-value,env-var-2=env-var-value-2" # optional "$ENV_NAME=$ENV_VALUE,$ENV_NAME_2=$ENV_VALUE_2"
193 | spec:
194 |   type: state.my-component
195 |   version: v1
196 |   metadata: []
197 | ```
198 | 
199 | By default the injector creates undeclared volumes as `emptyDir` volumes; if you want a different volume type, you should declare it yourself in your pods.
200 | 
201 | #### Pod annotations
202 | 
203 | In order to allow users to turn off the component injector for their pod, a new annotation will be available, similar to the one that we have for enabling dapr: `dapr.io/inject-pluggable-components:"true"`.
Let's rewrite the previous examples using the injected pluggable components feature; it would be something like:
204 | 
205 | The apps deployments/pods:
206 | 
207 | ```yaml
208 | apiVersion: apps/v1
209 | kind: Deployment
210 | metadata:
211 |   name: app
212 |   labels:
213 |     app: app
214 | spec:
215 |   replicas: 1
216 |   selector:
217 |     matchLabels:
218 |       app: app
219 |   template:
220 |     metadata:
221 |       labels:
222 |         app: app
223 |       annotations:
224 |         dapr.io/inject-pluggable-components: "true"
225 |         dapr.io/app-id: "my-app"
226 |         dapr.io/enabled: "true"
227 |     spec:
228 |       containers:
229 |         - name: my-app
230 |           image: my-app-image:latest
231 | ---
232 | apiVersion: apps/v1
233 | kind: Deployment
234 | metadata:
235 |   name: app-2
236 |   labels:
237 |     app: app-2
238 | spec:
239 |   replicas: 1
240 |   selector:
241 |     matchLabels:
242 |       app: app-2
243 |   template:
244 |     metadata:
245 |       labels:
246 |         app: app-2
247 |       annotations:
248 |         dapr.io/inject-pluggable-components: "true"
249 |         dapr.io/app-id: "my-app-2"
250 |         dapr.io/enabled: "true"
251 |     spec:
252 |       containers:
253 |         - name: my-app-2
254 |           image: my-app-2-image:latest
255 | ```
256 | 
257 | And the component spec:
258 | 
259 | ```yaml
260 | apiVersion: dapr.io/v1alpha1
261 | kind: Component
262 | metadata:
263 |   name: my-component
264 |   annotations:
265 |     dapr.io/component-container-image: "component:v1.0.0"
266 |     dapr.io/component-container-volume-mounts: "my-component-required-volume:/my-data"
267 |     dapr.io/component-container-env: "MY_ENV_VAR_NAME=MY_ENV_VAR_VALUE"
268 | spec:
269 |   type: state.my-component
270 |   version: v1
271 |   metadata: []
272 | ```
273 | 
274 | ### Feature lifecycle outline
275 | 
276 | #### Expectations
277 | 
278 | The feature is expected to be delivered as part of dapr/dapr v1.10.0 as a preview feature together with the new pluggable components SDK.
279 | 
280 | #### Compatibility guarantees
281 | 
282 | Pluggable components that have been used will not be affected by this.
283 | 
284 | #### Deprecation / co-existence with existing functionality
285 | 
286 | N/A
287 | 
288 | ### Acceptance Criteria
289 | 
290 | N/A
291 | 
292 | ## Completion Checklist
293 | 
294 | What changes or actions are required to make this proposal complete? Some examples:
295 | 
296 | - [ ] Change the sidecar injector to make requests to the operator for listing components (or list it using its own role)
297 | - [ ] Add 1 more annotation for pods `dapr.io/inject-pluggable-components: "true"` and 3 more for components `dapr.io/component-container-image`, `dapr.io/component-container-env` and `dapr.io/component-container-volume-mounts`
298 | - [ ] Add the components container injector based on declared components
299 | 
--------------------------------------------------------------------------------
/20221130-I-enhance-dapr-run-multiple-apps.md:
--------------------------------------------------------------------------------
1 | # Run multiple applications with Dapr sidecars
2 | 
3 | Author(s): Mukundan Sundararajan
4 | 
5 | State: Ready for implementation
6 | 
7 | Updated: 30th Nov 2022
8 | 
9 | ## Overview
10 | 
11 | This is a proposal for a feature to be included in the dapr CLI which allows an easy way to start multiple services that need to be run in tandem, along with their `daprd` sidecars, in local self-hosted mode.
12 | 
13 | ## Background
14 | 
15 | Currently, to run multiple services along with their `daprd` sidecars locally, users need to run multiple `dapr run` commands, keep track of all ports opened, the components folders each service refers to, the config file each service refers to, etc.
16 | There are also multiple other flags that can be used to tweak the behavior of the `dapr run` command, e.g. `--unix-domain-socket`, `--dapr-internal-grpc-port`, `--app-health-check-path`, etc.
17 | 
18 | This increases the complexity of using dapr in development, where users want to run multiple services in local mode and be able to partially/fully replicate the production scenario.
19 | 
20 | In K8s mode this is alleviated through the use of helm/deployment YAML files. There is currently no such capability available for local self-hosted mode.
21 | 
22 | Asking a user to run multiple different `dapr run` commands, each with different flags, increases the complexity for users onboarding onto Dapr.
23 | 
24 | ### Why dapr CLI?
25 | 
26 | In the initial [proposal](https://github.com/dapr/community/issues/207), the solution was proposed as a separate repo and CLI in itself. But later it was suggested to add a `compose` command to the dapr CLI itself instead of having a separate CLI.
27 | The main reason for including it in the dapr CLI itself is that users do not have to download and use a new CLI in addition to the dapr CLI.
28 | 
29 | This feature is tightly coupled with, and opinionated on, how `dapr` is going to be run locally, and having a separate CLI `dapr-compose` deciding how the `dapr` CLI should be used is not a good pattern to start with.
30 | 
31 | > Note: `daprd` is more generally used and considered a binary and not necessarily a CLI tool. So the `dapr` CLI is not making use of another CLI but rather passing on configs for running a binary.
32 | 
33 | ## Related Items
34 | 
35 | ### Related Proposals
36 | - https://github.com/dapr/community/issues/207
37 | - https://github.com/dapr/cli/issues/1123
38 | 
39 | ### Related Issues
40 | 
41 | ## Expectations and alternatives
42 | 
43 | The scope of this proposal is to enhance the `run` CLI command, allowing users to define and run multiple services from a single run configuration file.
44 | 
45 | This proposal specifically targets running in local environments and `slim` mode, where container engines may not be available. For running a `daprd` container along with an `app` container, the solution is to use Kubernetes or docker-compose.
46 | 
47 | For this proposal we will target running the applications and sidecars as processes in the OS.
48 | 
49 | > Note: All other commands in the `dapr` CLI for self-hosted mode are written to work with processes
50 | 
51 | ## Requirements
52 | 
53 | The main requirements for the command:
54 | - being able to configure multiple dapr apps from a single configuration file
55 | - users should be able to use normal `dapr` CLI commands for self-hosted mode against any apps that are started through `dapr compose`
56 | 
57 | An additional requirement for this feature is to come up with conventions on how to organize/run Dapr projects locally.
58 | 
59 | ## Proposed Structure for organizing Dapr projects locally
60 | 
61 | Currently the `dapr` CLI initializes a folder called `.dapr` in the home directory (user profile directory for Windows), and the default configurations and resources to be used are stored there.
62 | 
63 | Users developing different apps using dapr will have different resources/configurations that they use per application.
Each time the user has to run the application with particular config and resources directory values, they have to override the flags for the `dapr run` command.
64 | 
65 | Instead, the following convention is proposed for loading the resources/config for an application.
66 | The command expects the following directory structure:
67 | ```
68 | .dapr/
69 | |____ config.yaml
70 | |
71 | |____ resources/
72 |       |
73 |       |____ statestore.yaml
74 |       |____ pubsub.yaml
75 |       |____ resiliency_conf.yaml
76 |       |____ subscription.yaml
77 | ```
78 | In each app directory, there can be a `.dapr` folder, which contains a `resources` directory and a `config.yaml` file. If that directory is not present, the default locations are used: `~/.dapr/resources/` and `~/.dapr/config.yaml` (`%USERPROFILE%` instead of `~` for Windows).
79 | 
80 | > Note: This change will be made in `dapr run` only when the newly introduced `-f` flag is used. See [below](#precedence-rules) for details on which folder content will take precedence when a run configuration is given as input.
81 | 
82 | > Note: This change does not impact the `bin` folder where the `dapr` CLI looks for the `daprd` and `dashboard` binaries. That will still remain the same `~/.dapr/bin/` (`%USERPROFILE%` for Windows).
83 | 
84 | ## Proposed Structure for run configuration file
85 | 
86 | > Expected default file name is `dapr.yaml`
87 | 
88 | ```yaml
89 | version: 1
90 | common:
91 |   resources_dir: ./app/components # any dapr resources to be shared across apps
92 |   env: # any environment variable shared among apps
93 |     - DEBUG: true
94 | apps:
95 |   - app_id: webapp
96 |     app_dir: ./webapp/
97 |     resources_dir: ./webapp/components # (optional) can be default by convention too, ignore if dir is not found.
98 |     config_file: ./webapp/config.yaml # (optional) can be default by convention too, ignore if file is not found.
99 |     app_protocol: HTTP
100 |     app_port: 8080
101 |     app_health_check_path: "/healthz" # All _ converted to - for all properties defined under daprd section
102 |     command: ["python3", "app.py"]
103 |   - app_id: backend
104 |     app_dir: ./backend/
105 |     app_protocol: GRPC
106 |     app_port: 3000
107 |     unix_domain_socket: "/tmp/test-socket"
108 |     env:
109 |       - DEBUG: false
110 |     command: ["./backend"]
111 | ```
112 | > Note: Running the dependencies for each app as containers is out of scope for this discussion initially. We might consider that in the future.
113 | 
114 | - Each file contains a `common` object, which contains `env`, `resources_dir` and `config_file` values that can be used in common across all the apps defined in this YAML
115 | - There is an `apps` section that lists the different app configs.
116 | - Each app config has the following fields:
117 |   - `app_id` application ID (mandatory field). Passed to `daprd` as `--app-id`.
118 |   - `app_dir` directory of the application (mandatory field).
119 |   - `resources_dir` (optional) directory(ies) of all dapr resources (components, resiliency policies, subscription CRDs) (overrides the common definition). Passed to `daprd` as `--resources-dir`.
120 |   - `config_file` (optional) the configuration file to be used for this app (overrides the common definition). Passed to `daprd` as `--config-file`.
121 |   - `app_protocol` application protocol, HTTP or gRPC; defaults to HTTP. Passed to `daprd` as `--app-protocol`.
122 |   - `app_port` port the app listens on, if any. Passed to `daprd` as `--app-port`.
123 |   - other `dapr run` parameters (mostly pass-through flags to `daprd`). All properties must use `_` as separators; they will be validated (so that no unknown flags are passed) and translated to `-` for the command-line arguments of `daprd`.
124 |   - `command`: ["exec", "arg1", "arg2"] format for the application command
125 |   - `env`, which overrides or adds to the common env vars defined, or the shell env vars passed in when `dapr compose` is called
126 | 
127 | The `DAPR_HTTP_PORT` and `DAPR_GRPC_PORT` will be passed in as extra environment variables to the application that is being run. Those flags for `daprd` can be overridden in the run configuration file above, but that is optional, as random ports will be assigned as needed.
128 | 
129 | ### Precedence rules
130 | 
131 | For the `env` field:
132 | > Note: In addition to the defined env fields, the app also gets the `DAPR_HTTP_PORT` and `DAPR_GRPC_PORT` fields.
133 | 
134 | - If no field is present, the environment variables of the current shell which executes the CLI command are passed to the app and `daprd`.
135 | - If the `env` field is present in the `common` section, in addition to the shell environment variables, the `env` map defined will be passed to all `apps` and `daprd` sidecars.
136 | - If the `env` field is present only in a particular `apps` section, any shell environment variables, `env` maps from the `common` section, and the `env` map for the current app will be passed to both the `app` and `daprd`.
137 | - The more specific `env` key-value pairs will override the less specific ones, i.e. `apps` section specific `env` key-value pairs will override the key-value pairs from the `common` section, which will override the passed-in shell environment variables.
138 | 
139 | 
140 | For each app in the `apps` section, the `resources_dir` and `config_file` values will be resolved in the following order (see the sketch at the end of this section):
141 | 
142 | - If `resources_dir` and/or `config_file` fields are given for any `apps[i]` configuration, use that value as is.
143 | - If not, check for a `.dapr` folder in the **`apps[i].app_dir`** folder. If found, use the `.dapr/resources` folder for configuring the resources and `.dapr/config.yaml` for the `daprd` configuration file (argument `--config-file` in `daprd`).
144 | - If not, check if `resources_dir` and/or `config_file` fields are defined in the `common` section of the compose configuration file. If so, use those values for those fields.
145 | - If not, default to `~/.dapr/resources/` for `resources_dir` and `~/.dapr/config.yaml` for `config_file` values.
146 | 
147 | 
148 | ## Proposed command format
149 | 
150 | Given the run configuration file defined above, there should be a way to use it to run the different applications and `daprd` sidecars with the configuration given in the file.
151 | 
152 | For this there will be a flag `-f, --file` defined in the `dapr run` command. If the input path is a `file`, it expects the file to have the [structure defined above](#proposed-structure-for-run-configuration-file).
153 | If the path for the flag is a directory, then it expects the `dapr.yaml` file to be present in the directory with the same [structure defined above](#proposed-structure-for-run-configuration-file).
154 | 
155 | ### Interaction flow
156 | 
157 | The interaction flow for `dapr run -f ` is shown below.
158 | 
159 | ![interaction flow](./resources/20221130-I-enhance-dapr-run-multiple-apps/interaction-flow-1.png)
160 | 
161 | > Note: app-id needs to be unique across all applications that have been run using `dapr run`.
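
To make the resolution order concrete, here is a minimal Go sketch of the `resources_dir` precedence rules above (the same order applies to `config_file`). The type and function names are illustrative only and are not the actual CLI implementation:

```go
package cli

import (
	"os"
	"path/filepath"
)

// AppConfig and CommonConfig are illustrative stand-ins for the parsed
// run configuration file; they are not the actual CLI types.
type AppConfig struct {
	AppDir       string
	ResourcesDir string
}

type CommonConfig struct {
	ResourcesDir string
}

func dirExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && info.IsDir()
}

// resolveResourcesDir applies the precedence rules described above.
func resolveResourcesDir(app AppConfig, common CommonConfig, home string) string {
	if app.ResourcesDir != "" { // 1. explicit value in the app's own section
		return app.ResourcesDir
	}
	if dir := filepath.Join(app.AppDir, ".dapr", "resources"); dirExists(dir) { // 2. app-local .dapr folder
		return dir
	}
	if common.ResourcesDir != "" { // 3. common section of the run configuration
		return common.ResourcesDir
	}
	return filepath.Join(home, ".dapr", "resources") // 4. default location
}
```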
162 | ### Logging options
163 | 
164 | Right now, `dapr run` executes as a foreground interactive process; both the `daprd` logs and the associated application logs are written directly to the STDOUT of the `dapr run` _process shell_ and are not stored anywhere.
165 | 
166 | Considering that executing `dapr run -f ` will run multiple applications, routing the logs of all applications and `daprd` processes to STDOUT will make the STDOUT completely chaotic, and the user will be overwhelmed with log output.
167 | 
168 | For example, consider two applications `order-proc` and `checkout` that are run on executing `dapr run -f ` with logs routed to STDOUT:
169 | 
170 | ```
171 | ==APP== waiting for daprd to start
172 | INFO[0000] enabled gRPC tracing middleware app_id=order-proc instance=Mukundans-MacBook-Pro.local scope=dapr.runtime.grpc.api type=log ver=1.9.3
173 | ==APP==pinging dapr API
174 | INFO[0000] enabled gRPC tracing middleware app_id=checkout instance=Mukundans-MacBook-Pro.local scope=dapr.runtime.grpc.api type=log ver=1.9.3
175 | INFO[0000] started daprd app_id=order-proc instance=Mukundans-MacBook-Pro.local scope=dapr.runtime.grpc.api type=log ver=1.9.3
176 | ==APP== starting the application
177 | ==APP== processing request 1
178 | ==APP== processing request 2
179 | INFO[0000] started daprd app_id=checkout instance=Mukundans-MacBook-Pro.local scope=dapr.runtime.grpc.api type=log ver=1.9.3
180 | ==APP== processing request 2
181 | ==APP== processing request 3. request 3 calls dapr API
182 | INFO[0000] request 3 calls dapr API app_id=order-proc instance=Mukundans-MacBook-Pro.local scope=dapr.runtime.grpc.api type=log ver=1.9.3
183 | ==APP== processing request 3
184 | 
185 | ```
186 | 
187 | With logs interleaved as shown above, it will be chaotic to see which application is in which state, and so on.
188 | 
189 | Instead of writing to STDOUT, each application and its associated `daprd` process will write logs to `{application dir}/.dapr/logs/{app id}/app_{datetime}.log` and `{application dir}/.dapr/logs/{app id}/daprd_{datetime}.log`.
190 | 
191 | ## Feature lifecycle outline
192 | 
193 | Compatibility with `dapr run` is expected to be maintained. But in certain cases there might be introduction of new behavior, which might be _opt-in_ for running individual applications using `dapr run`, whereas it might be _on by default_ when `dapr run -f ` is used.
194 | 
195 | The expectation is for this feature to be refined and stabilized over a series of releases.
196 | 
197 | ### Recommendation for initial version
198 | 
199 | - Initial implementation will only support Linux OS.
200 | - `dapr run -f ` will be an interactive process running for the complete lifecycle of the applications; on exiting the process, all other spawned processes will also quit.
201 | - Logs will be written to a predefined location (present in the app dir). Users will need to manually tail the files.
202 | - [optional] Change the full `dapr run` command itself to honor the [proposed organization for dapr projects](#proposed-structure-for-organizing-dapr-projects-locally). If not, the proposed organizing structure will be used only when the `-f` flag is used.
203 | 
204 | ### Changes for future releases
205 | 
206 | - Extend support for Windows and macOS.
207 | - Extend `dapr run` to have a detached mode `-d, --detach` flag. This will also be honored when running multiple applications using the `-f` flag.
208 | - Add support for `dapr logs` to query saved logs while the application is being run.
209 | 
210 | 
211 | ## Completion checklist
212 | For the initial version
213 | - [ ] Implement the initial version of the `dapr run -f ` feature
214 | - [ ] Add E2E tests for this feature
215 | - [ ] Add documentation for this feature
216 | 
217 | For later
218 | - [ ] Enhance the `dapr run` command to have a detached mode
219 | - [ ] Enhance the `dapr logs` command to track and output logs in self-hosted mode
220 | 
--------------------------------------------------------------------------------
/20230327-RCBS-Crypto-building-block.md:
--------------------------------------------------------------------------------
1 | # Crypto(graphy) building block
2 | 
3 | - Author(s): Alessandro Segala (@ItalyPaleAle)
4 | - Updated: 2023-03-27
5 | 
6 | ## Overview
7 | 
8 | This is a proposal for a new building block for Dapr to allow developers to leverage cryptography in a SAFE and consistent way. The goal is to expose an API that allows developers to ask Dapr to perform operations such as encrypting and decrypting messages, and calculating and verifying digital signatures.
9 | 
10 | ### Business problem
11 | 
12 | Modern applications make extensive use of cryptography, which, when implemented correctly, can make solutions safer even in case data is compromised. Moreover, in certain cases the use of crypto is required to comply with industry regulations (think banking) or even with legal requirements (GDPR). However, leveraging cryptography is hard: developers need to pick the right algorithms and options, and need to learn the proper way to manage and protect keys. Additionally, there are operational complexities when teams want to limit who has access to cryptographic key material.
13 | 
14 | Organizations have increasingly started to leverage tools and services to perform cryptographic operations outside of applications. Examples include services such as Azure Key Vault, AWS KMS, Google Cloud KMS, etc. Customers may also use on-prem HSM products like Thales Luna. While those products/services perform the same or very similar operations, their APIs are very different.
15 | 
16 | This is an area where Dapr can help. Just like we're offering an abstraction on top of secret stores, we can offer an abstraction layer on top of key vaults.
17 | 
18 | ### Solution overview
19 | 
20 | Benefits include:
21 | 
22 | - Making it easier for developers to perform cryptographic operations in a safe way. Dapr provides safeguards against using unsafe algorithms, or using algorithms with unsafe options.
23 | - Keeping keys outside of applications. Applications never see key material, but can request the vault to perform operations with the keys.
24 | - Allowing greater separation of concerns. By using external vaults, only authorized teams can access private/shared key materials.
25 | - Simplifying key management and key rotation. Keys are managed in the vault and outside of the application, and they can be rotated without needing the developers to be involved (or even without restarting the apps).
26 | - Enabling better audit logging to monitor when operations are performed with keys in the vault.
27 | 
28 | ### APIs: High-level vs subtle
29 | 
30 | The building block features 2 kinds of operations:
31 | 
32 | - Low-level or "subtle" (a term frequently used to indicate low-level crypto operations; one example is how in browsers, low-level operations are in the [`crypto.subtle`](https://developer.mozilla.org/en-US/docs/Web/API/SubtleCrypto) object).
These offer developers full control over the schemes that are used, allowing them to specify parameters, keys, modes of operation, etc.
33 |   As the "subtle" name implies, using these operations requires a certain level of understanding of what they do and how to use them safely; they are not meant to be consumed by the "general public".
34 |   Dapr offers these low-level operations for two reasons:
35 |   1. Just like it is Dapr's value proposition to, for example, offer a consistent API abstracting various state stores, we will offer a consistent API to interface with key vaults. It allows using key vaults from multiple cloud providers, as well as using keys stored as Kubernetes secrets, all while interfacing with the same API.
36 |   2. Developers working on applications that require interacting with existing and/or external solutions need these lower-level APIs to be able to maintain compatibility with them.
37 | - High-level operations. At launch, these will cover data encryption and decryption only.
38 |   When using these higher-level methods, Dapr offers a great level of abstraction. Developers just need to provide a key (symmetric or asymmetric) and can then use Dapr to encrypt/decrypt data without having to worry about anything else. Dapr will choose the best ciphers and modes of operation, offering access to an encryption scheme that is secure and flexible.
39 | 
40 | ### Low-level APIs
41 | 
42 | The new building block will feature 7 low-level APIs:
43 | 
44 | - `encrypt`: encrypts arbitrary data using a key stored in the vault. It supports symmetric and asymmetric ciphers, depending on the type of key in use (and the types of keys supported by the vault).
45 | - `decrypt`: decrypts arbitrary data, performing the opposite of what `/encrypt` does.
46 | - `wrapkey`: wraps keys using other keys stored in the vault. This is exactly like encrypting data, but it expects inputs to be formatted as keys (for example, formatted as JSON Web Key) and it exposes additional algorithms not available when encrypting general data (like AES-KW).
47 | - `unwrapkey`: un-wraps (decrypts) keys, performing the opposite of what `/wrapkey` does.
48 | - `sign`: signs an arbitrary message using an asymmetric key stored in the vault (we could also consider offering HMAC here, using symmetric keys, although that is not widely supported by vault services).
49 | - `verify`: verifies a digital signature over an arbitrary message, using an asymmetric key stored in the vault (same: we may be able to offer HMAC too).
50 | - `getkey`: can be used only with asymmetric keys stored in the vault, and returns the public part of the key.
51 | 
52 | ### High-level APIs
53 | 
54 | The high-level APIs at launch will only include support for **encrypting** and **decrypting** messages (of arbitrary length), using the [Dapr encryption scheme](#dapr-encryption-scheme-daprioencv1).
55 | 
56 | Although these APIs will be available over both gRPC and HTTP, the gRPC implementation is **strongly** preferred, since it allows encrypting/decrypting data as a stream. The HTTP implementation requires keeping the entire message in memory, both as plaintext and ciphertext (a limitation in the HTTP protocol itself we cannot work around), which is not desirable unless users are encrypting very small files.
57 | 
58 | ### Components
59 | 
60 | Different components will be developed to perform those operations on supported backends, such as the products/services listed above. Dapr would "translate" these calls into whatever format the backends require.
Dapr never sees the private/shared keys, which remain safely stored inside the vaults.
61 | 
62 | Additionally, we will offer a "local" crypto component where keys are stored as Kubernetes secrets and cryptographic operations are performed within the Dapr sidecar. Although this is not as secure as using an external key vault, it still offers some benefits such as using standardized APIs and separation of concerns/roles with regards to key management.
63 | 
64 | Algorithms available will depend on what the backend vaults support, but in general developers should always find AES (encrypt/decrypt only) and RSA; when supported, we can also offer ChaCha20-Poly1305 (encrypt/decrypt only) and ECC with ECDSA or EdDSA (sign/verify only).
65 | 
66 | ## Related Items
67 | 
68 | Previous proposal as GitHub issue: dapr/dapr#4508
69 | 
70 | ## Data flow: runtime and components
71 | 
72 | ![data flow](./resources/20230327-RCBS-Crypto-building-block/data-flow.png)
73 | 
74 | ## Dapr encryption scheme: dapr.io/enc/v1
75 | 
76 | In the first version of the building block, we define 2 higher-level operations to encrypt and decrypt data, in addition to the low-level operations.
77 | 
78 | > **Sources:** The encryption scheme that Dapr uses is heavily inspired by the [Tink wire format](https://developers.google.com/tink/wire-format) (from the Tink library maintained by Google), as well as by Filippo Valsorda's [age](https://age-encryption.org/v1), and Minio's [DARE](https://github.com/minio/sio).
79 | 
80 | The **Dapr encryption scheme** is optimized for processing data as a stream. Data is chunked into multiple parts which are encrypted independently. This allows us to return data to callers as a stream, even when decrypting messages, being confident that we are not flushing unverified data to the client.
81 | 
82 | ### Key
83 | 
84 | Each message is encrypted with a 256-bit symmetric **File Key (FK)** that is randomly generated by Dapr for each new message. The key must be generated as 32 bytes of output from a CSPRNG (such as Go's `crypto/rand.Reader`) and must not be reused for other files.
85 | 
86 | The FK is wrapped by Dapr using a key stored in a key vault (the **Key Encryption Key (KEK)**). The result of the wrapping operation is the **Wrapped File Key (WFK)**. The algorithm used depends on the type of the KEK as well as the algorithms supported by the component, in order of preference:
87 | 
88 | - For symmetric keys:
89 |   - AES-KW with 256-bit keys ([RFC 3394](https://www.rfc-editor.org/rfc/rfc3394.html)): `A256KW`
90 |     - Because the File Key is 256-bit long, only 256-bit wrapping keys can be used
91 |   - AES-CBC with 128-bit, 192-bit, and 256-bit keys: `A128CBC-NOPAD`, `A192CBC-NOPAD`, `A256CBC-NOPAD`
92 |     - These don't use PKCS#7 padding because the File Key is 256 bits, a multiple of the AES block size.
93 | - For RSA keys:
94 |   - RSA OAEP with SHA-256: `RSA-OAEP-256`
95 |   - Dapr doesn't impose limitations on the size of the key, and any key bigger than 1024 bits should work; however, 4096-bit keys are strongly recommended.
96 | 
97 | 
98 | > In the future, we should explore how to add support for elliptic curve cryptography, for example P-256/P-384/P-521 or Curve25519, which requires performing a static ECDH key agreement.
99 | 
100 | ### Ciphertext format
101 | 
102 | The ciphertext is formatted as:
103 | 
104 | ```text
105 | header || binary payload
106 | ```
107 | 
108 | ### Header
109 | 
110 | The **header** is human-readable and contains 3 items, each terminated by a line feed (0x0A) character:
111 | 
112 | 1. Name and version of the encryption scheme used. Currently, this is always `dapr.io/enc/v1`
113 | 2. The manifest, which is a JSON object.
114 | 3. The MAC for the header, base64-encoded
115 | 
116 | > Base64 encoding follows [RFC 4648 §4](https://datatracker.ietf.org/doc/html/rfc4648#section-4) ("standard" format, with padding included but optional when decoding)
117 | 
118 | ```text
119 | dapr.io/enc/v1
120 | {"k":"mykey","kw":1,"wfk":"hGYjwDpWEXEymSTFZ95zgX8krElb3Gqyls67R8zJA3k=","cph":1,"np":"Y3J5cHRvIQ=="}
121 | pBDKLrhAWL7IAvDKBV/v7lmbTG6AEZbf3srUN0Pnn30=
122 | ```
123 | 
124 | #### Manifest
125 | 
126 | The second line in the header is the **manifest**, which is a compact JSON object.
127 | 
128 | Its corresponding Go struct is:
129 | 
130 | ```go
131 | type Manifest struct {
132 | 	// Name of the key that can be used to decrypt the message.
133 | 	// This is optional, and if specified can be in the format `key` or `key/version`.
134 | 	KeyName string `json:"k,omitempty"`
135 | 	// ID of the wrapping algorithm used.
136 | 	// 0x01 = A256KW
137 | 	// 0x02 = A128CBC-NOPAD
138 | 	// 0x03 = A192CBC-NOPAD
139 | 	// 0x04 = A256CBC-NOPAD
140 | 	// 0x05 = RSA-OAEP-256
141 | 	KeyWrappingAlgorithm int `json:"kw"`
142 | 	// The Wrapped File Key
143 | 	WFK []byte `json:"wfk"`
144 | 	// ID of the cipher used.
145 | 	// 0x01 = AES-GCM
146 | 	// 0x02 = ChaCha20-Poly1305
147 | 	Cipher int `json:"cph"`
148 | 	// Random sequence of 7 bytes generated by a CSPRNG
149 | 	NoncePrefix []byte `json:"np"`
150 | }
151 | ```
152 | 
153 | - **`KeyName`** is the name of the key that can be used to decrypt the message. Usually this is the same as the name of the key used to encrypt the message, but when asymmetric ciphers are used, it could be different. Including a `KeyName` in the manifest is not required, but when it's present, it's used as the default value for the key name while decrypting the document (however, users can override this value by passing a custom one while decrypting the document).
154 | - **`Cipher`** indicates the cipher used to encrypt the actual data, and it must be an [AEAD](https://en.wikipedia.org/wiki/Authenticated_encryption#Authenticated_encryption_with_associated_data_(AEAD)) symmetric cipher.
155 |   - Dapr will choose AES-GCM as the cipher by default.
156 |   - ChaCha20-Poly1305 is offered as an option for users that work with hardware that doesn't support AES-NI (such as the Raspberry Pi), and needs to be enabled manually.
157 |   - In the future, we can support other authenticated ciphers such as AES-CBC with HMAC-SHA256.
158 | 
159 | #### MAC
160 | 
161 | The third and final line is the MAC for the header, which is computed with HMAC-SHA-256 over the previous 2 lines (including the final newline character) with a key that is derived from the (plain-text) File Key with HKDF-SHA-256:
162 | 
163 | ```text
164 | mac-key = HKDF-SHA-256(ikm = file key, salt = empty, info = "header")
165 | MAC = HMAC-SHA-256(key = mac-key, message = first 2 lines of the header)
166 | ```
167 | 
168 | > HKDF-SHA-256 is a key derivation function based on HMAC with SHA-256. See [RFC 5869 ("HMAC-based Extract-and-Expand Key Derivation Function (HKDF)")](https://www.rfc-editor.org/rfc/rfc5869.html). Being based on HMAC, it's not vulnerable to length-extension attacks, so we do not consider it necessary to use SHA-512 and truncate the output to 256 bits.
169 | 
170 | Note that there's one newline character (0x0A) at the end of the MAC, which concludes the header.
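
As a concrete illustration, here is a minimal Go sketch of the header MAC computation defined above. It assumes the `golang.org/x/crypto/hkdf` package and is a sketch of the scheme, not the runtime's actual implementation:

```go
package enc

import (
	"crypto/hmac"
	"crypto/sha256"
	"io"

	"golang.org/x/crypto/hkdf"
)

// headerMAC computes the header MAC as defined above:
//   mac-key = HKDF-SHA-256(ikm = file key, salt = empty, info = "header")
//   MAC     = HMAC-SHA-256(key = mac-key, message = first 2 lines of the header)
func headerMAC(fileKey, firstTwoLines []byte) ([]byte, error) {
	// Derive a 256-bit MAC key from the plain-text File Key.
	macKey := make([]byte, 32)
	kdf := hkdf.New(sha256.New, fileKey, nil, []byte("header"))
	if _, err := io.ReadFull(kdf, macKey); err != nil {
		return nil, err
	}
	// firstTwoLines must include the newline terminating the manifest line.
	h := hmac.New(sha256.New, macKey)
	h.Write(firstTwoLines)
	return h.Sum(nil), nil
}
```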
171 | 
172 | > Because each JSON encoder could produce a slightly different output, when verifying the manifest the MAC should be computed on the exact manifest string as included in the header. Verifiers should not re-encode the raw message as JSON.
173 | 
174 | ### Binary payload
175 | 
176 | The binary payload begins immediately after the header (after the 3rd newline character) and it includes each encrypted segment of data:
177 | 
178 | ```text
179 | segment_0 || segment_1 || ... || segment_k
180 | ```
181 | 
182 | ### Segments
183 | 
184 | The plaintext is chunked into segments of 64KB (65,536 bytes) each; the last segment may be shorter. Segments must never be empty, unless the entire file is empty.
185 | 
186 | > Because segments are 64KB each, and we can have up to 2^32 segments, the maximum size of the encrypted message is 256TB.
187 | 
188 | Each segment of plaintext is encrypted independently and stored together with its authentication tag:
189 | 
190 | ```text
191 | encrypted chunk || tag
192 | ```
193 | 
194 | Segments are encrypted with a **Payload Key (PK)** that is derived from the (plain-text) File Key and the nonce prefix:
195 | 
196 | ```text
197 | payload-key = HKDF-SHA-256(ikm = file key, salt = nonce prefix, info = "payload")
198 | ```
199 | 
200 | Each segment is encrypted using a different 12-byte nonce:
201 | 
202 | ```text
203 | nonce_prefix || i || last_segment
204 | ```
205 | 
206 | Where:
207 | 
208 | - `nonce_prefix` (7 bytes) is the nonce prefix from the header
209 | - `i` (4 bytes) is the sequence number, as a 32-bit unsigned integer counter, encoded as big-endian. The first segment has sequence number 0, and it increases by one for each subsequent segment.
210 | - `last_segment` (1 byte) is `0x01` if this is the last segment, or `0x00` otherwise
211 | 
212 | ## Components
213 | 
214 | Components in dapr/components-contrib implement low-level primitives only, while all higher-level operations are performed by the runtime, so they are executed in a consistent way across all backends/services. This is because the job of the components is limited to actually interacting with the key vaults, and everything else is best handled by the runtime.
215 | 
216 | Components are to be placed in the `crypto` folder and must implement the `SubtleCrypto` interface:
217 | 
218 | ```go
219 | import "github.com/lestrrat-go/jwx/v2/jwk"
220 | 
221 | // SubtleCrypto offers an interface to perform low-level ("subtle") cryptographic operations with keys stored in the vault
222 | type SubtleCrypto interface {
223 | 	// GetKey returns the public part of a key stored in the vault
224 | 	// This method returns an error if the key is symmetric
225 | 	GetKey(
226 | 		// Context that can be used to cancel the running operation
227 | 		ctx context.Context,
228 | 		// Name (or name/version) of the key to use in the key vault
229 | 		key string,
230 | 	) (
231 | 		// Object containing the public key
232 | 		pubKey jwk.Key,
233 | 		// Error
234 | 		err error,
235 | 	)
236 | 
237 | 	// Encrypt encrypts a small message and returns the ciphertext
238 | 	Encrypt(
239 | 		// Context that can be used to cancel the running operation
240 | 		ctx context.Context,
241 | 		// Input plaintext
242 | 		plaintext []byte,
243 | 		// Encryption algorithm to use
244 | 		algorithm string,
245 | 		// Name (or name/version) of the key to use in the key vault
246 | 		key string,
247 | 		// Nonce / initialization vector
248 | 		// Ignored with asymmetric ciphers
249 | 		nonce []byte,
250 | 		// Associated Data when using AEAD ciphers
251 | 		// Optional, can be nil
252 | 		associatedData []byte,
253 | 	) (
254 | 		// Encrypted ciphertext
255 | 		ciphertext []byte,
256 | 		// Authentication tag
257 | 		// This is nil when not using an authenticated cipher
258 | 		tag []byte,
259 | 		// Error
260 | 		err error,
261 | 	)
262 | 
263 | 	// Decrypt decrypts a small message and returns the plaintext
264 | 	Decrypt(
265 | 		// Context that can be used to cancel the running operation
266 | 		ctx context.Context,
267 | 		// Input ciphertext
268 | 		ciphertext []byte,
269 | 		// Encryption algorithm to use
270 | 		algorithm string,
271 | 		// Name (or name/version) of the key to use in the key vault
272 | 		key string,
273 | 		// Nonce / initialization vector
274 | 		// Ignored with asymmetric ciphers
275 | 		nonce []byte,
276 | 		// Authentication tag
277 | 		// Ignored when not using an authenticated cipher
278 | 		tag []byte,
279 | 		// Associated Data when using AEAD ciphers
280 | 		// Optional, can be nil
281 | 		associatedData []byte,
282 | 	) (
283 | 		// Decrypted plaintext
284 | 		plaintext []byte,
285 | 		// Error
286 | 		err error,
287 | 	)
288 | 
289 | 	// WrapKey wraps a key
290 | 	WrapKey(
291 | 		// Context that can be used to cancel the running operation
292 | 		ctx context.Context,
293 | 		// Key to wrap as jwk.Key object
294 | 		plaintextKey jwk.Key,
295 | 		// Encryption algorithm to use
296 | 		algorithm string,
297 | 		// Name (or name/version) of the key to use in the key vault
298 | 		key string,
299 | 		// Nonce / initialization vector
300 | 		// Ignored with asymmetric ciphers
301 | 		nonce []byte,
302 | 		// Associated Data when using AEAD ciphers
303 | 		// Optional, can be nil
304 | 		associatedData []byte,
305 | 	) (
306 | 		// Wrapped key
307 | 		wrappedKey []byte,
308 | 		// Authentication tag
309 | 		// This is nil when not using an authenticated cipher
310 | 		tag []byte,
311 | 		// Error
312 | 		err error,
313 | 	)
314 | 
315 | 	// UnwrapKey unwraps a key
316 | 	UnwrapKey(
317 | 		// Context that can be used to cancel the running operation
318 | 		ctx context.Context,
319 | 		// Wrapped key
320 | 		wrappedKey []byte,
321 | 		// Encryption algorithm to use
322 | 		algorithm string,
323 | 		// Name (or name/version) of the key to use in the key vault
324 | 		key string,
325 | 		// Nonce / initialization vector
326 | 		// Ignored with asymmetric ciphers
327 | 		nonce []byte,
328 | 		// Authentication tag
329 | 		// Ignored when not using an authenticated cipher
330 | 		tag []byte,
331 | 		// Associated Data when using AEAD ciphers
332 | 		// Optional, can be nil
333 | 		associatedData []byte,
334 | 	) (
335 | 		// Plaintext key
336 | 		plaintextKey jwk.Key,
337 | 		// Error
338 | 		err error,
339 | 	)
340 | 
341 | 	// Sign signs a digest
342 | 	Sign(
343 | 		// Context that can be used to cancel the running operation
344 | 		ctx context.Context,
345 | 		// Digest to sign
346 | 		digest []byte,
347 | 		// Signing algorithm to use
348 | 		algorithm string,
349 | 		// Name (or name/version) of the key to use in the key vault
350 | 		// The key must be asymmetric
351 | 		key string,
352 | 	) (
353 | 		// Signature that was computed
354 | 		signature []byte,
355 | 		// Error
356 | 		err error,
357 | 	)
358 | 
359 | 	// Verify verifies a signature
360 | 	Verify(
361 | 		// Context that can be used to cancel the running operation
362 | 		ctx context.Context,
363 | 		// Digest of the message
364 | 		digest []byte,
365 | 		// Signature to verify
366 | 		signature []byte,
367 | 		// Signing algorithm to use
368 | 		algorithm string,
369 | 		// Name (or name/version) of the key to use in the key vault
370 | 		// The key must be asymmetric
371 | 		key string,
372 | 	) (
373 | 		// True if the signature is valid
374 | 		valid bool,
375 | 		// Error
376 | 		err error,
377 | 	)
378 | }
379 | ```
380 | 
381 | A few notes about all the methods above:
382 | 
383 | 1. Keys are passed as `jwk.Key` objects, from the (excellent) [lestrrat-go/jwx library](https://pkg.go.dev/github.com/lestrrat-go/jwx/v2)
384 | 1. The `algorithm` should be represented as a constant as defined by [RFC 7518 ("JSON Web Algorithms (JWA)")](https://www.rfc-editor.org/rfc/rfc7518.html). For the most part, Dapr components should not try to parse the value submitted by the user (unless the component is the "local" one that performs crypto operations directly), and should pass the value directly to the key vault.
385 | 1. The `key` parameter can contain a version if keys can be versioned in the vault. The format should be `name/version`. If no version is specified, it's assumed to be the latest.
386 | 
387 | Notes on `WrapKey` and `UnwrapKey`:
388 | 
389 | 1. If the key needs to be encoded (common with asymmetric keys), it needs to be encoded before being passed to the component. For example, in the runtime, RSA keys may be represented in a [`rsa.PrivateKey` object](https://pkg.go.dev/crypto/rsa#PrivateKey), and need to be encoded in PKCS#1 format. Symmetric keys can be passed as-is, as they are normally stored in a byte slice already.
390 | 1. `WrapKey` and `UnwrapKey` can be implemented on top of `Encrypt` and `Decrypt` if the underlying key vault does not have a special operation for key wrapping/unwrapping, as sketched below.
391 | 
392 | Notes on `Encrypt` and `Verify`:
393 | 
394 | 1. When using an asymmetric key, these operations can be performed using the public key without hitting the key vault. However, for consistency and to ensure that we always use the latest version of the key, components should always perform them in the vault. An exception could be if the key has a specific version, in which case components may opt to download the public key, cache it, and perform the operation locally.
395 | 
396 | Notes on `Sign` and `Verify`:
397 | 
398 | 1. Certain algorithms (currently `Ed25519`) do not operate on a message digest, but rather on the message itself.
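
To illustrate note 2 on `WrapKey` and `UnwrapKey` above, here is a hypothetical sketch of implementing key wrapping on top of `Encrypt` for a symmetric key. The helper name is illustrative and not part of the proposed interface:

```go
package crypto

import (
	"context"

	"github.com/lestrrat-go/jwx/v2/jwk"
)

// wrapKeyViaEncrypt is a hypothetical helper showing how a component could
// implement WrapKey on top of Encrypt when the vault has no native
// key-wrapping operation. It assumes a symmetric key, whose raw material is
// a plain byte slice; asymmetric keys would need an encoding step
// (e.g. PKCS#1 for RSA) instead.
func wrapKeyViaEncrypt(ctx context.Context, c SubtleCrypto, plaintextKey jwk.Key,
	algorithm, keyName string, nonce, associatedData []byte) (wrappedKey, tag []byte, err error) {
	// Extract the raw key material from the jwk.Key object.
	var raw []byte
	if err = plaintextKey.Raw(&raw); err != nil {
		return nil, nil, err
	}
	// Wrapping is then just an encryption of the raw key material.
	return c.Encrypt(ctx, raw, algorithm, keyName, nonce, associatedData)
}
```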
399 | 400 | ## gRPC APIs 401 | 402 | In the Dapr gRPC APIs, we are extending the `runtime.v1.Dapr` service to add new methods: 403 | 404 | > Note: APIs will have "Alpha1" added while in preview 405 | 406 | ```proto 407 | // (Existing Dapr service) 408 | service Dapr { 409 | // SubtleGetKey returns the public part of an asymmetric key stored in the vault. 410 | rpc SubtleGetKey(SubtleGetKeyRequest) returns (SubtleGetKeyResponse); 411 | 412 | // SubtleEncrypt encrypts a small message using a key stored in the vault. 413 | rpc SubtleEncrypt(SubtleEncryptRequest) returns (SubtleEncryptResponse); 414 | 415 | // SubtleDecrypt decrypts a small message using a key stored in the vault. 416 | rpc SubtleDecrypt(SubtleDecryptRequest) returns (SubtleDecryptResponse); 417 | 418 | // SubtleWrapKey wraps a key using a key stored in the vault. 419 | rpc SubtleWrapKey(SubtleWrapKeyRequest) returns (SubtleWrapKeyResponse); 420 | 421 | // SubtleUnwrapKey unwraps a key using a key stored in the vault. 422 | rpc SubtleUnwrapKey(SubtleUnwrapKeyRequest) returns (SubtleUnwrapKeyResponse); 423 | 424 | // SubtleSign signs a message using a key stored in the vault. 425 | rpc SubtleSign(SubtleSignRequest) returns (SubtleSignResponse); 426 | 427 | // SubtleVerify verifies the signature of a message using a key stored in the vault. 428 | rpc SubtleVerify(SubtleVerifyRequest) returns (SubtleVerifyResponse); 429 | 430 | // Encrypt encrypts a message using the Dapr encryption scheme and a key stored in the vault. 431 | rpc Encrypt(stream EncryptRequest) returns (stream EncryptResponse); 432 | 433 | // Decrypt decrypts a message using the Dapr encryption scheme and a key stored in the vault. 434 | rpc Decrypt(stream DecryptRequest) returns (stream DecryptResponse); 435 | } 436 | 437 | // rpc SubtleGetKey(SubtleGetKeyRequest) returns (SubtleGetKeyResponse); 438 | 439 | // SubtleGetKeyRequest is the request object for SubtleGetKey. 440 | message SubtleGetKeyRequest { 441 | enum KeyFormat { 442 | // PEM (SPKI) (default) 443 | PEM = 0; 444 | // JSON (JSON Web Key) as string 445 | JSON = 1; 446 | } 447 | 448 | // Name of the component 449 | string component_name = 1 [json_name="component"]; 450 | // Name (or name/version) of the key to use in the key vault 451 | string name = 2; 452 | // Response format 453 | KeyFormat format = 3; 454 | } 455 | 456 | // SubtleGetKeyResponse is the response for SubtleGetKey. 457 | message SubtleGetKeyResponse { 458 | // Name (or name/version) of the key. 459 | // This is returned as response too in case there is a version. 460 | string name = 1; 461 | // Public key, encoded in the requested format 462 | string public_key = 2 [json_name="publicKey"]; 463 | } 464 | 465 | // rpc SubtleEncrypt(SubtleEncryptRequest) returns (SubtleEncryptResponse); 466 | 467 | // SubtleEncryptRequest is the request for SubtleEncrypt. 468 | message SubtleEncryptRequest { 469 | // Name of the component 470 | string component_name = 1 [json_name="component"]; 471 | // Message to encrypt. 472 | bytes plaintext = 2; 473 | // Algorithm to use, as in the JWA standard. 474 | string algorithm = 3; 475 | // Name (or name/version) of the key. 476 | string key = 4; 477 | // Nonce / initialization vector. 478 | // Ignored with asymmetric ciphers. 479 | bytes nonce = 5; 480 | // Associated Data when using AEAD ciphers (optional). 481 | bytes associated_data = 6 [json_name="associatedData"]; 482 | } 483 | 484 | // SubtleEncryptResponse is the response for SubtleEncrypt. 
485 | message SubtleEncryptResponse { 486 | // Encrypted ciphertext. 487 | bytes ciphertext = 1; 488 | // Authentication tag. 489 | // This is nil when not using an authenticated cipher. 490 | bytes tag = 2; 491 | } 492 | 493 | // rpc SubtleDecrypt(SubtleDecryptRequest) returns (SubtleDecryptResponse); 494 | 495 | // SubtleDecryptRequest is the request for SubtleDecrypt. 496 | message SubtleDecryptRequest { 497 | // Name of the component 498 | string component_name = 1 [json_name="component"]; 499 | // Message to decrypt. 500 | bytes ciphertext = 2; 501 | // Algorithm to use, as in the JWA standard. 502 | string algorithm = 3; 503 | // Name (or name/version) of the key. 504 | string key = 4; 505 | // Nonce / initialization vector. 506 | // Ignored with asymmetric ciphers. 507 | bytes nonce = 5; 508 | // Authentication tag. 509 | // This is nil when not using an authenticated cipher. 510 | bytes tag = 6; 511 | // Associated Data when using AEAD ciphers (optional). 512 | bytes associated_data = 7 [json_name="associatedData"]; 513 | } 514 | 515 | // SubtleDecryptResponse is the response for SubtleDecrypt. 516 | message SubtleDecryptResponse { 517 | // Decrypted plaintext. 518 | bytes plaintext = 1; 519 | } 520 | 521 | // rpc SubtleWrapKey(SubtleWrapKeyRequest) returns (SubtleWrapKeyResponse); 522 | 523 | // SubtleWrapKeyRequest is the request for SubtleWrapKey. 524 | message SubtleWrapKeyRequest { 525 | // Name of the component 526 | string component_name = 1 [json_name="component"]; 527 | // Key to wrap 528 | bytes plaintext_key = 2 [json_name="plaintextKey"]; 529 | // Algorithm to use, as in the JWA standard. 530 | string algorithm = 3; 531 | // Name (or name/version) of the key. 532 | string key = 4; 533 | // Nonce / initialization vector. 534 | // Ignored with asymmetric ciphers. 535 | bytes nonce = 5; 536 | // Associated Data when using AEAD ciphers (optional). 537 | bytes associated_data = 6 [json_name="associatedData"]; 538 | } 539 | 540 | // SubtleWrapKeyResponse is the response for SubtleWrapKey. 541 | message SubtleWrapKeyResponse { 542 | // Wrapped key. 543 | bytes wrapped_key = 1 [json_name="wrappedKey"]; 544 | // Authentication tag. 545 | // This is nil when not using an authenticated cipher. 546 | bytes tag = 2; 547 | } 548 | 549 | // rpc SubtleUnwrapKey(SubtleUnwrapKeyRequest) returns (SubtleUnwrapKeyResponse); 550 | 551 | // SubtleUnwrapKeyRequest is the request for SubtleUnwrapKey. 552 | message SubtleUnwrapKeyRequest { 553 | // Name of the component 554 | string component_name = 1 [json_name="component"]; 555 | // Wrapped key. 556 | bytes wrapped_key = 2 [json_name="wrappedKey"]; 557 | // Algorithm to use, as in the JWA standard. 558 | string algorithm = 3; 559 | // Name (or name/version) of the key. 560 | string key = 4; 561 | // Nonce / initialization vector. 562 | // Ignored with asymmetric ciphers. 563 | bytes nonce = 5; 564 | // Authentication tag. 565 | // This is nil when not using an authenticated cipher. 566 | bytes tag = 6; 567 | // Associated Data when using AEAD ciphers (optional). 568 | bytes associated_data = 7 [json_name="associatedData"]; 569 | } 570 | 571 | // SubtleUnwrapKeyResponse is the response for SubtleUnwrapKey. 572 | message SubtleUnwrapKeyResponse { 573 | // Key in plaintext 574 | bytes plaintext_key = 1 [json_name="plaintextKey"]; 575 | } 576 | 577 | // rpc SubtleSign(SubtleSignRequest) returns (SubtleSignResponse); 578 | 579 | // SubtleSignRequest is the request for SubtleSign. 
580 | message SubtleSignRequest { 581 | // Name of the component 582 | string component_name = 1 [json_name="component"]; 583 | // Digest to sign. 584 | bytes digest = 2; 585 | // Algorithm to use, as in the JWA standard. 586 | string algorithm = 3; 587 | // Name (or name/version) of the key. 588 | string key = 4; 589 | } 590 | 591 | // SubtleSignResponse is the response for SubtleSign. 592 | message SubtleSignResponse { 593 | // The signature that was computed 594 | bytes signature = 1; 595 | } 596 | 597 | // rpc SubtleVerify(SubtleVerifyRequest) returns (SubtleVerifyResponse); 598 | 599 | // SubtleVerifyRequest is the request for SubtleVerify. 600 | message SubtleVerifyRequest { 601 | // Name of the component 602 | string component_name = 1 [json_name="component"]; 603 | // Digest of the message. 604 | bytes digest = 2; 605 | // Signature to verify. 606 | bytes signature = 3; 607 | // Algorithm to use, as in the JWA standard. 608 | string algorithm = 4; 609 | // Name (or name/version) of the key. 610 | string key = 5; 611 | } 612 | 613 | // SubtleVerifyResponse is the response for SubtleVerify. 614 | message SubtleVerifyResponse { 615 | // True if the signature is valid. 616 | bool valid = 1; 617 | } 618 | 619 | // rpc Encrypt(stream EncryptRequest) returns (stream EncryptResponse); 620 | 621 | message EncryptRequest { 622 | // Request details. Must be present in the first message only. 623 | EncryptRequestOptions options = 1; 624 | // Chunk of data of arbitrary size. 625 | common.v1.StreamPayload payload = 2; 626 | } 627 | 628 | message EncryptRequestOptions { 629 | // Name of the component 630 | string component_name = 1 [json_name="component"]; 631 | // Name (or name/version) of the key. 632 | string key = 2; 633 | // Force algorithm to use to encrypt data: "aes-gcm" or "chacha20-poly1305" (optional) 634 | string algorithm = 10; 635 | // If true, the encrypted document does not contain a key reference. 636 | // In that case, calls to the Decrypt method must provide a key reference (name or name/version). 637 | // Defaults to false. 638 | bool omit_decryption_key_name = 11 [json_name="omitDecryptionKeyName"]; 639 | // Key reference to embed in the encrypted document (name or name/version). 640 | // This is helpful if the reference of the key used to decrypt the document is different from the one used to encrypt it. 641 | // If unset, uses the reference of the key used to encrypt the document (this is the default behavior). 642 | // This option is ignored if omit_decryption_key_name is true. 643 | string decryption_key = 12 [json_name="decryptionKey"]; 644 | } 645 | 646 | message EncryptResponse { 647 | // Chunk of data. 648 | common.v1.StreamPayload payload = 1; 649 | } 650 | 651 | // rpc Decrypt(stream DecryptRequest) returns (stream DecryptResponse); 652 | 653 | message DecryptRequest { 654 | // Request details. Must be present in the first message only. 655 | DecryptRequestOptions options = 1; 656 | // Chunk of data of arbitrary size. 657 | common.v1.StreamPayload payload = 2; 658 | } 659 | 660 | message DecryptRequestOptions { 661 | // Name of the component 662 | string component_name = 1 [json_name="component"]; 663 | // Name (or name/version) of the key to decrypt the message. 664 | // Overrides any key reference included in the message if present. 665 | // This is required if the message doesn't include a key reference (i.e. was created with omit_decryption_key_name set to true). 666 | string key = 12; 667 | } 668 | 669 | message DecryptResponse { 670 | // Chunk of data. 
671 |   common.v1.StreamPayload payload = 1;
672 | }
673 | ```
674 | 
675 | > For the `common.v1.StreamPayload` message, see [dapr/dapr#5170](https://github.com/dapr/dapr/pull/5170)
676 | 
677 | The `Encrypt` and `Decrypt` methods are stream-based. Dapr will read from the client until it has sufficient data, and will then send back the encrypted/decrypted data to the client. Clients must thus both send data to the RPC and listen for incoming messages. SDKs can offer consumers methods to read the data as a stream (e.g. in Go, methods that accept an `io.Reader` and return an `io.Reader`).
678 | 
679 | ## HTTP APIs
680 | 
681 | ### Low-level
682 | 
683 | The low-level HTTP APIs are developed as an exact "port" of the gRPC "subtle" APIs, and the contents of the request and response bodies match exactly the fields in the gRPC APIs (except for the component name, which is in the URL).
684 | 
685 | List of HTTP endpoints and the corresponding gRPC method:
686 | 
687 | - `POST /v1.0/subtlecrypto/[component]/getkey` -> SubtleGetKey
688 | - `POST /v1.0/subtlecrypto/[component]/encrypt` -> SubtleEncrypt
689 | - `POST /v1.0/subtlecrypto/[component]/decrypt` -> SubtleDecrypt
690 | - `POST /v1.0/subtlecrypto/[component]/wrapkey` -> SubtleWrapKey
691 | - `POST /v1.0/subtlecrypto/[component]/unwrapkey` -> SubtleUnwrapKey
692 | - `POST /v1.0/subtlecrypto/[component]/sign` -> SubtleSign
693 | - `POST /v1.0/subtlecrypto/[component]/verify` -> SubtleVerify
694 | 
695 | > Note: URL will begin with `/v1.0-alpha1` while in preview
696 | 
697 | > These APIs are implemented as "Universal" APIs in Dapr, where the business logic is implemented in gRPC only, and the APIs are then exposed as HTTP using the Universal API wrapper.
698 | 
699 | ### High-level
700 | 
701 | For the high-level APIs, we cannot use Universal APIs because we cannot perform bi-directional streaming with HTTP.
702 | 
703 | As mentioned earlier, using HTTP for the high-level APIs is **highly inefficient**, and users will be strongly advised against doing that outside of development or testing scenarios. In fact, while the Dapr encryption scheme is designed for streaming, that is not possible when using HTTP: first, the Dapr sidecar needs to receive the entire message (e.g. the plaintext while encrypting), and only after that can it begin responding to the caller; this means the Dapr sidecar needs to keep the entire message in memory.
704 | 
705 | List of high-level HTTP endpoints:
706 | 
707 | - `PUT /v1.0/crypto/[component]/encrypt`
708 |   - Query-string arguments:
709 |     - `key` (required): name (or name/version, URL-encoded) of the key
710 |     - `algorithm` (optional): `aes-gcm` (default) or `chacha20-poly1305`
711 |   - Body: the plain-text message to encrypt (in "raw format", i.e. not using multipart/form-data)
712 |   - Response: the ciphertext (in "raw format")
713 | - `PUT /v1.0/crypto/[component]/decrypt`
714 |   - Query-string arguments:
715 |     - `key` (required): name (or name/version, URL-encoded) of the key
716 |   - Body: the ciphertext to decrypt (in "raw format", i.e. not using multipart/form-data)
717 |   - Response: the plain-text message (in "raw format")
718 | 
719 | > Note: URL will begin with `/v1.0-alpha1` while in preview
720 | 
721 | > Note: the body is limited by Dapr's ["http-max-request-size" option](https://docs.dapr.io/operations/configuration/increase-request-size/).
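
For illustration, a minimal Go client for the high-level encrypt endpoint could look like the following. The component name `mycrypto`, the key name `mykey`, and the conventional sidecar HTTP port 3500 are placeholder values, and the URL uses the `/v1.0-alpha1` prefix noted above:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

func main() {
	plaintext := []byte("hello, world")

	// PUT the raw plaintext to the high-level encrypt endpoint.
	url := "http://localhost:3500/v1.0-alpha1/crypto/mycrypto/encrypt?key=mykey"
	req, err := http.NewRequest(http.MethodPut, url, bytes.NewReader(plaintext))
	if err != nil {
		panic(err)
	}

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// The response body is the ciphertext in raw format.
	ciphertext, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Printf("received %d bytes of ciphertext\n", len(ciphertext))
}
```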
722 | 
--------------------------------------------------------------------------------
/20230406-B-external-service-invocation.md:
--------------------------------------------------------------------------------
1 | # External Service Invocation
2 | 
3 | * Author(s): Samantha Coyle (@sicoyle)
4 | * State: Ready for Implementation
5 | * Updated: 04/06/2023
6 | 
7 | ## Overview
8 | 
9 | This is a design proposal for the requested [external service invocation feature](https://github.com/dapr/dapr/issues/4549).
10 | 
11 | The goal of this feature enhancement is to provide developers with a way to invoke any service of their choosing,
12 | using the existing building blocks provided by Dapr.
13 | 
14 | ## Background
15 | 
16 | ### Motivation
17 | We want Dapr users to be able to invoke external,
18 | non-Daprized services with ease and flexibility.
19 | 
20 | ### Goals
21 | Implement a change into `dapr/dapr` that facilitates a seamless Dapr UX to allow for external service invocation using existing building blocks and feature sets.
22 | 
23 | ### Current Shortfalls
24 | Currently, we have the service invocation API that allows Dapr users to use the invoke API on the Dapr instance.
25 | This provides many features as part of the service invocation building block, such as HTTP & gRPC service invocation,
26 | service-to-service security, resiliency, observability, access control, namespace scoping, load balancing, and service discovery.
27 | However, the current implementation does not allow for external service invocations - which is a real bummer for many Dapr users.
28 | 
29 | To remind everyone of the workaround many Dapr users rely on: the HTTP binding.
30 | Dapr users can create an HTTP binding with their external URL specified,
31 | but this approach has many drawbacks that yield a less-than-desirable developer experience.
32 | 
33 | For additional background information,
34 | please refer to the [external service invocation feature request](https://github.com/dapr/dapr/issues/4549).
35 | 
36 | ## Related Items
37 | 
38 | ### Related proposals
39 | 
40 | Formalizing the proposal here from [this issue](https://github.com/dapr/dapr/issues/4549).
41 | 
42 | ## Expectations and alternatives
43 | 
44 | * What is in scope for this proposal?
45 | Feature enhancement to enable external service invocation
46 | using the existing service invocation building block, allowing service communication over the HTTP protocol.
47 | 
48 | * What is deliberately *not* in scope?
49 | gRPC invocation, as well as additional authentication such as OAuth2,
50 | is not within the scope of this initial proposal and implementation.
51 | 
52 | 
53 | * What alternatives have been considered, and why do they not solve the problem?
54 | 1. Expanding the existing HTTP Binding.
55 | 2. Creation of another HTTP Binding explicitly dedicated to external service invocation, keeping in mind the current pain points.
56 | 
57 | Moving forward with the alternative approaches goes against the motivation and goal of this proposal,
58 | as Dapr users would be missing crucial service invocation features,
59 | be restricted on numerous avenues,
60 | and be forced to continue abiding by an awkwardly clunky workaround.
61 | Additional pros/cons may be found in the [linked issue's discussion](https://github.com/dapr/dapr/issues/4549#issuecomment-1414841151).
62 | 
63 | * Are there any trade-offs being made? (space for time, for example)
64 | N/A
65 | 
66 | * What advantages / disadvantages does this proposal have?
67 | This proposal allows service invocation to be enabled for non-Dapr endpoints.
68 | 
69 | Pros:
70 | - Extends the existing service invocation implementation.
71 | - Same feel as the current user invocation process.
72 | - Can leverage existing service invocation features like resiliency, security practices, and observability.
73 | - Leveraging a new CRD would keep our CRD setup less cluttered and easier to adjust and add to moving forward.
74 | - With a new CRD you can add/remove endpoints programmatically via kubectl.
75 | - Allows for user overrides such as base URL and related request information at invocation time.
76 | 
77 | Cons:
78 | - Creation of an additional CRD, thus increasing the duplication of boilerplate code for its setup.
79 | - Need to know the external base URL ahead of time to configure it, but that may not always be easy for end users.
80 | - Would need to change the Dapr Operator to notify on edits for external endpoints.
81 | 
82 | ## Implementation Details
83 | 
84 | ### Design
85 | 
86 | How will this work, technically?
87 | 
88 | Allow configuration of the pieces needed for external service invocation through the creation of a new CRD titled `HTTPEndpoint`.
89 | It is HTTP-specific in its `Kind`.
90 | This has the benefit of making it obvious upfront that it supports only `http`,
91 | and means we do not need `spec.allowed.protocols`.
92 | However, it has the drawback of needing additional CRDs in the future to support other protocols such as `gRPC`.
93 | The sample `yaml` file snippet below represents the proposed configuration.
94 | 
95 | ```yaml
96 | apiVersion: dapr.io/v1alpha1
97 | kind: HTTPEndpoint
98 | metadata:
99 |   name: "github"
100 | spec:
101 |   baseUrl: "http://api.github.com"
102 |   headers:
103 |     - name: "Accept-Language"
104 |       value: "en-US"
105 |     - name: "Content-Type"
106 |       value: "application/json"
107 |     - name: "Authorization"
108 |       secretKeyRef:
109 |         name: "my-secret"
110 |         key: "mymetadataSecret"
111 | auth:
112 |   secretStore: "my-secretstore"
113 | ```
114 | 
115 | Noteworthy caveat:
116 | If the `Authorization` header is specified,
117 | then it is assumed that the [auth-scheme](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Authorization) prefix (i.e. token, basic, etc.)
118 | is specified within the value for the `Authorization` header field.
119 | This allows headers to match the existing HTTP header schema,
120 | thus leading to a straightforward user experience.
121 | 
122 | Implementation for external service invocation will sit alongside the existing service invocation building block implementation, with API changes to support external invocation.
123 | 
124 | User-facing changes include overriding the URL when calling Dapr for service invocation.
125 | Users will use the existing service invocation API, but instead of using an app ID,
126 | they can use an external URL and optionally override fields at the time of invocation.
127 | 
128 | To summarize, there would be two ways of working with external service invocations:
129 | 1. The URL format, programmatically.
130 | This allows for convenience and includes a single HTTP call.
131 | 2. HTTPEndpoint resource creation, declaratively,
132 | where the `HTTPEndpoint.Name` would be used as the AppId in the existing service invocation URL.
133 | 
134 | #### Examples
135 | 
136 | 1. URL format overwritten:
137 | `http://localhost:${daprPort}/v1.0/invoke/http://api.github.com/method/`
138 | 
139 | 2.
143 | ### Feature lifecycle outline
144 | 
145 | * Compatibility guarantees
146 | This feature is fully compatible with the existing service invocation API.
147 | 
148 | * Deprecation / co-existence with existing functionality
149 | Support for external service invocation will sit alongside, and expand, the existing service invocation API.
150 | 
151 | * Feature flags
152 | N/A
153 | 
154 | ### Acceptance Criteria
155 | 
156 | How will success be measured?
157 | 
158 | * Performance targets
159 | N/A
160 | 
161 | * Compatibility requirements
162 | This feature will need to be fully compatible with the existing service invocation API.
163 | In the case that a user adds an `HTTPEndpoint` with the same name as an AppId in the same namespace and performs service invocation,
164 | then the `HTTPEndpoint` will be invoked.
165 | Calls for service invocation will first check if the AppId matches an `HTTPEndpoint` CRD,
166 | and in the case that it does, then external service invocation will occur.
167 | 
168 | * Metrics
169 | Existing service invocation tracing and metrics capabilities when calling external endpoints will be fully functional.
170 | 
171 | ## Completion Checklist
172 | 
173 | What changes or actions are required to make this proposal complete?
174 | 
175 | * Code changes
176 | * Secret resolution
177 | * Tests added (e2e, unit)
178 | * SDK changes (if needed)
179 | * Documentation
180 | 
181 | --------------------------------------------------------------------------------
/20230511-BCIRS-error-handling-codes.md:
--------------------------------------------------------------------------------
1 | # Dapr Error Handling/Codes
2 | 
3 | * Author(s): Roberto J. Rojas
4 | * State: Draft
5 | * Updated: 5/11/2023
6 | 
7 | ## Overview
8 | 
9 | Across Dapr, errors are surfaced for different conditions without consistent messages, error details, or standard formats, and with no clear indication of what or where the error originated.
10 | 
11 | This makes troubleshooting and debugging quite difficult and requires a deep understanding of the parts of Dapr and how those parts interact with each other.
12 | 
13 | To help with the issues raised above, it would be ideal if a solution could provide:
14 | - Greater details of errors that occurred.
15 | - Error details in a structured format.
16 | - Consistency in the error details.
17 | - An indication where within the Dapr execution (Init, Runtime, Components, SDKs, etc...) the error occurred.
18 | 
19 | ## Background
20 | 
21 | ## Related Items
22 | 
23 | ### Related proposals
24 | 
25 | 
26 | ### Related issues
27 | 
28 | https://github.com/dapr/dapr/issues/6068
29 | 
30 | ## Expectations and alternatives
31 | 
32 | ## Implementation Details
33 | 
34 | # Design
35 | 
36 | ## Solution
37 | Utilize and follow the [gRPC Richer Error Model](https://grpc.io/docs/guides/error/#richer-error-model) and the [Google API Errors Model in the Design Guide](https://cloud.google.com/apis/design/errors#error_model).
38 | 
39 | ### Error Code Standard
40 | The [Google API Error Model](https://cloud.google.com/apis/design/errors#error_model) has the following Protobuf format:
41 | ```proto
42 | package google.rpc;
43 | 
44 | // The `Status` type defines a logical error model that is suitable for
45 | // different programming environments, including REST APIs and RPC APIs.
46 | message Status {
47 |   // A simple error code that can be easily handled by the client. The
48 |   // actual error code is defined by `google.rpc.Code`.
49 |   int32 code = 1;
50 | 
51 |   // A developer-facing human-readable error message in English. It should
52 |   // both explain the error and offer an actionable resolution to it.
53 |   string message = 2;
54 | 
55 |   // Additional error information that the client code can use to handle
56 |   // the error, such as retry info or a help link.
57 |   repeated google.protobuf.Any details = 3;
58 | }
59 | ```
60 | 
61 | Here is one of the possible details that can be added to the above
62 | error structure. This is defined in the [error_details.proto Protobuf](https://github.com/googleapis/googleapis/blob/master/google/rpc/error_details.proto)
63 | 
64 | ```proto
65 | message ErrorInfo {
66 |   // The reason of the error. This is a constant value that identifies the
67 |   // proximate cause of the error. Error reasons are unique within a particular
68 |   // domain of errors. This should be at most 63 characters and match a
69 |   // regular expression of `[A-Z][A-Z0-9_]+[A-Z0-9]`, which represents
70 |   // UPPER_SNAKE_CASE.
71 |   string reason = 1;
72 | 
73 |   // The logical grouping to which the "reason" belongs. The error domain
74 |   // is typically the registered service name of the tool or product that
75 |   // generates the error. Example: "pubsub.googleapis.com". If the error is
76 |   // generated by some common infrastructure, the error domain must be a
77 |   // globally unique value that identifies the infrastructure. For Google API
78 |   // infrastructure, the error domain is "googleapis.com".
79 |   string domain = 2;
80 | 
81 |   // Additional structured details about this error.
82 |   //
83 |   // Keys should match /[a-zA-Z0-9-_]/ and be limited to 64 characters in
84 |   // length. When identifying the current value of an exceeded limit, the units
85 |   // should be contained in the key, not the value. For example, rather than
86 |   // {"instanceLimit": "100/request"}, should be returned as,
87 |   // {"instanceLimitPerRequest": "100"}, if the client exceeds the number of
88 |   // instances that can be created in a single (batch) request.
89 |   map<string, string> metadata = 3;
90 | }
91 | ```
92 | 
93 | ### Error Status
94 | The properties of the **google.rpc.Status** will be populated as follows:
95 | 
96 | - **Code** - Protocol level error code. These could be either gRPC or HTTP error codes. See the [gRPC Codes ProtoBuf](https://github.com/googleapis/googleapis/blob/master/google/rpc/code.proto)
97 | 
98 |     Example: "InvalidArgument Code = 3", "Internal Code = 13"
99 | 
100 | - **Message** - Error message.
101 | - **Details** - A set of standard error payloads for error details. This list can be found in [Error Details](https://github.com/googleapis/googleapis/blob/master/google/rpc/error_details.proto)
102 | 
103 |     Example: "ErrorInfo", "ResourceInfo"
104 | 
105 | 
106 | 
107 | Below is a partial table of the standard error codes provided by gRPC and how they map to HTTP error codes. The entire list can be found in the following links:
108 | - [Google API Error Handling](https://cloud.google.com/apis/design/errors#handling_errors)
109 | - [gRPC Codes ProtoBuf](https://github.com/googleapis/googleapis/blob/master/google/rpc/code.proto)
110 | 
111 | 
112 | |HTTP | gRPC        | Description |
113 | |---- | ----------- | ----------- |
114 | |200  | OK | No error. |
115 | |400  | INVALID_ARGUMENT | Client specified an invalid argument. Check error message and error details for more information. |
116 | |400  | FAILED_PRECONDITION | Request cannot be executed in the current system state, such as deleting a non-empty directory. |
117 | |400  | OUT_OF_RANGE | Client specified an invalid range. |
118 | |401  | UNAUTHENTICATED | Request not authenticated due to missing, invalid, or expired authorization credentials. |
119 | |403  | PERMISSION_DENIED | Client does not have sufficient permission. |
120 | |404  | NOT_FOUND | A specified resource is not found. |
121 | |409  | ABORTED | Concurrency conflict, such as read-modify-write conflict. |
122 | 
123 | 
124 | ### ErrorInfo (Required)
125 | The properties of the **type.googleapis.com/google.rpc.ErrorInfo** will be populated as follows:
126 | 
127 | - **Reason** - A combination of a prefix from the tables below plus the error condition code.
128 | 
129 |     Example: "DAPR_STATE_" + "ETAG_MISMATCH"
130 | 
131 | - **Domain** - With the value `dapr.io`.
132 | 
133 | - **Metadata** - A key/value map/dictionary of data relevant to the error condition.
134 | 
135 | **Note:** The metadata property **retriable**, with a boolean-like value ("true", "false", "True", "False", "TRUE", "FALSE", "1", "0"), is required.
136 | 
137 | ### ResourceInfo (Optional)
138 | The properties of the **type.googleapis.com/google.rpc.ResourceInfo** will be populated as follows:
139 | 
140 | - **ResourceType** - The building block type with version.
141 | 
142 |     Example: "state.redis/v1"
143 | 
144 | - **ResourceName** - The component name.
145 | 
146 |     Example: "my-component-name"
147 | 
148 | - **Owner** - The owner of the component.
149 | 
150 | - **Description** - Resource description or error details.
151 | 
152 | 
153 | ### Error Details Prefixes
154 | The following tables show the proposed error code prefixes used in the **reason** of the **google.rpc.ErrorInfo** for various Dapr building blocks:
155 | 
156 | 
157 | **INIT**
158 | | Dapr Module | Prefix |
159 | | ----------- | ----------- |
160 | | CLI | DAPR_CLI_INIT_* |
161 | | Self-hosted | DAPR_SELF_HOSTED_INIT_* |
162 | | K8S | DAPR_K8S_INIT_* |
163 | | Invoke | DAPR_INVOKE_INIT_* |
164 | 
165 | **RUNTIME**
166 | | Dapr Module | Prefix |
167 | | ----------- | ----------- |
168 | | CLI | DAPR_RUNTIME_CLI_* |
169 | | Self-hosted | DAPR_SELF_HOSTED_* |
170 | | dapr-2-dapr(gRPC) | DAPR_RUNTIME_GRPC_* |
171 | 
172 | **COMPONENTS**
173 | | Dapr Module | Prefix |
174 | | ----------- | ----------- |
175 | | PubSub | DAPR_PUBSUB_* |
176 | | StateStore | DAPR_STATE_* |
177 | | Bindings | DAPR_BINDING_* |
178 | | SecretStore | DAPR_SECRET_* |
179 | | ConfigurationStore | DAPR_CONFIGURATION_* |
180 | | Lock | DAPR_LOCK_* |
181 | | NameResolution | DAPR_NAME_RESOLUTION_* |
182 | | Middleware | DAPR_MIDDLEWARE_*|
183 | 
184 | 
185 | The following snippet shows an error status returned due to an `ETAG_MISMATCH` error condition. The **reason** is populated with `PREFIX+ERROR_CONDITION`:
186 | 
187 | ```json
188 | {
189 |   "code": 3,
190 |   "message": "possible etag mismatch. error from state store",
191 |   "details": [
192 |     {
193 |       "@type": "type.googleapis.com/google.rpc.ErrorInfo",
194 |       "reason": "DAPR_STATE_ETAG_MISMATCH",
195 |       "domain": "dapr.io",
196 |       "metadata": {
197 |         "key": "myapp||name"
198 |       }
199 |     },
200 |     {
201 |       "@type": "type.googleapis.com/google.rpc.ResourceInfo",
202 |       "resource_type": "state.redis/v1",
203 |       "resource_name": "my-component",
204 |       "owner": "",
205 |       "description": "possible etag mismatch. error from state store"
206 |     }
207 |   ]
208 | }
209 | ```
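For completeness, here is a minimal sketch of how a client could react to such an error programmatically; the handling shown is illustrative, not prescribed by this proposal:

```go
import (
	"google.golang.org/genproto/googleapis/rpc/errdetails"
	"google.golang.org/grpc/status"
)

// isRetriableEtagMismatch inspects the rich error details attached to a gRPC
// error and checks both the reason and the required "retriable" metadata.
func isRetriableEtagMismatch(err error) bool {
	st := status.Convert(err) // works for any gRPC error
	for _, d := range st.Details() {
		if info, ok := d.(*errdetails.ErrorInfo); ok {
			if info.GetDomain() == "dapr.io" && info.GetReason() == "DAPR_STATE_ETAG_MISMATCH" {
				// simplified check; the proposal allows several boolean-like spellings
				return info.GetMetadata()["retriable"] == "true"
			}
		}
	}
	return false
}
```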
210 | 
211 | 
212 | ### Sample Code Snippet (Go)
213 | ```go
214 | import (
215 |     ...
216 |     "google.golang.org/genproto/googleapis/rpc/errdetails"
217 |     "google.golang.org/grpc/codes"
218 |     "google.golang.org/grpc/status"
219 |     ...
220 | )
221 | ...
222 | if req.ETag != nil {
223 |     ...
224 |     ste := status.Newf(codes.InvalidArgument, messages.ErrStateGet, in.Key, in.StoreName, err.Error())
225 |     ei := errdetails.ErrorInfo{
226 |         Domain: "dapr.io",
227 |         Reason: "DAPR_STATE_ETAG_MISMATCH",
228 |         Metadata: map[string]string{
229 |             "storeName": in.StoreName,
230 |         },
231 |     }
232 |     ri := errdetails.ResourceInfo{
233 |         ResourceType: "state.redis/v1",
234 |         ResourceName: "my-redis-component",
235 |         Owner:        "user",
236 |         Description:  "possible etag mismatch. error from state store",
237 |     }
238 |     ste, err2 := ste.WithDetails(&ei, &ri)
239 |     ...
240 |     return ste.Err()
241 | }
242 | ```
243 | 
244 | ### Pros
245 | - Since the Dapr Runtime is using protocol buffers as the data format, support for the richer error model is already included in most of the gRPC implementations.
246 | - This would help minimize the changes within the Dapr ecosystem.
247 | - This solution could be used to programmatically react to errors as it provides a standard structure for the errors with details.
248 | 
249 | ### Cons
250 | - Dependencies on the gRPC richer error model.
251 | - Need to test gRPC implementation support for all Dapr SDKs.
252 | 
253 | 
254 | ## gRPC Richer Error Model POC
255 | For the POC, I've made changes to some parts of the Dapr modules. The POC code can be found in my GitHub repos under the branch **error-codes-poc**.
256 | 
257 | These are the gRPC imports used:
258 | 
259 | ```go
260 | import (
261 |     ...
262 |     "google.golang.org/genproto/googleapis/rpc/errdetails"
263 |     "google.golang.org/grpc/codes"
264 |     "google.golang.org/grpc/status"
265 |     ...
266 | )
267 | ```
268 | 
269 | The files changed for this POC:
270 | 
271 | https://github.com/robertojrojas/components-contrib/tree/error-codes-poc
272 | 
273 | - state/redis/redis.go
274 | - state/store.go
275 | 
276 | https://github.com/robertojrojas/dapr-kit/tree/error-codes-poc
277 | 
278 | - pkg/proto/customerrors/v1/customerrors.pb.go
279 | - proto/customerrors/v1/customerrors.proto
280 | - status/customerrorcodes.go
281 | - status/status.go
282 | 
283 | https://github.com/robertojrojas/dapr/tree/error-codes-poc
284 | 
285 | - pkg/diagnostics/grpc_tracing.go
286 | - pkg/grpc/api.go
287 | - pkg/http/api.go
288 | - pkg/http/responses.go
289 | 
290 | https://github.com/robertojrojas/dapr-go-sdk/tree/error-codes-poc
291 | 
292 | - client/state.go
293 | 
294 | https://github.com/robertojrojas/dapr-cli/tree/error-codes-poc
295 | 
296 | - pkg/standalone/invoke.go
297 | 
298 | https://github.com/robertojrojas/dapr-dotnet-sdk/tree/error-codes-poc
299 | 
300 | - src/Dapr.Client/DaprClientGrpc.cs
301 | 
302 | ### Feature lifecycle outline
303 | 
304 | ### Acceptance Criteria
305 | 
306 | 
307 | ## Completion Checklist
308 | 
309 | --------------------------------------------------------------------------------
/20230627-P-proposal-sdk-approval.md:
--------------------------------------------------------------------------------
1 | # Cross SDK changes approval
2 | 
3 | * Author(s): Artur Souza (@artursouza)
4 | * State: Ready
5 | * Updated: 06/27/2023
6 | 
7 | ## Overview
8 | 
9 | This defines the criteria for how SDK proposals are approved.
10 | 
11 | ## Background
12 | 
13 | The SDK Spec SIG will define the spec for how SDKs operate today, but there are still multiple areas where SDK designs are inconsistent due to decisions made by each SDK repository's maintainers. There needs to be a process to improve consistency across SDKs, independent of the SDK Spec SIG, since there are ongoing discussions for new features that keep SDKs drifting away from each other.
14 | 
15 | ## Proposal
16 | 
17 | Proposals that impact multiple SDKs should be presented in this repository and will be approved if at least a simple majority (50% + 1) of the maintainers of SDK repositories approve the change. The vote counted is per person and not per repository. Voting ends when the simple majority is reached; cannot be mathematically reached; or is idle (no changes, comments or votes) for more than 30 calendar days. The list of maintainers is based on the maintainers listed on GitHub as of the time of the proposal submission. Voting should be done via a comment in the proposal's pull request in this repository.
18 | 
19 | ## Approval of this proposal
20 | 
21 | As a way to kickstart this process, the STC should approve this proposal following the existing STC voting criteria, commenting on this proposal's pull request for future reference.
--------------------------------------------------------------------------------
/20230714-S-sdk-resiliency.md:
--------------------------------------------------------------------------------
1 | # Resiliency in Dapr SDKs
2 | 
3 | * Author(s): Artur Souza (@artursouza)
4 | * State: Ready for Implementation
5 | * Updated: 07/14/2023
6 | 
7 | ## Overview
8 | 
9 | This is a design proposal to [support resiliency when SDKs invoke remote Dapr APIs](https://github.com/dapr/dapr/issues/6609).
10 | 
11 | This will allow applications to talk to a remote or shared sidecar, without having to rely on custom retry and timeout logic in the user's application.
12 | 
13 | ## Background
14 | 
15 | ### Motivation
16 | - Applications to communicate with remote Dapr APIs when there is communication degradation.
17 | - Applications to communicate with the sidecar when there is degradation of the sidecar's health.
18 | 
19 | ### Goals
20 | - Dapr users can talk to a remote Dapr API without having to implement resiliency logic in their app.
21 | - System administrators don't need to have different configurations per application based on programming language, meaning the same configuration will work with every SDK.
22 | 
23 | ### Current Shortfalls
24 | - Applications need to implement resiliency (retry and timeout) on top of the existing SDK.
25 | 
26 | ## Related Items
27 | 
28 | ### Related proposals
29 | 
30 | Formalizing the proposal here from [this issue](https://github.com/dapr/dapr/issues/6609).
31 | 
32 | ## Expectations and alternatives
33 | 
34 | * What is in scope for this proposal?
35 |   - SDKs to support a consistent (and small) set of environment variables to configure resiliency on SDKs
36 |   - Consistent set of retriable errors for gRPC and HTTP APIs.
37 | 
38 | * What is deliberately *not* in scope?
39 |   - Circuit Breaking
40 |   - A highly configurable spec for resiliency policies (like the CRD in runtime)
41 | 
42 | 
43 | * What alternatives have been considered, and why do they not solve the problem?
44 |   1. Leave every SDK as-is:
45 |     - Undetermined behavior when the sidecar is down or too slow. For example, the Java SDK simply gets stuck forever if there is no response from the sidecar (tested with ToxiProxy).
46 |     - Timeout and retry need to be implemented in the user's application.
47 |   2. Add retry only:
48 |     - Undetermined behavior when the sidecar is down or too slow. For example, the Java SDK simply gets stuck forever if there is no response from the sidecar (tested with ToxiProxy).
49 |   3. Let each SDK decide how to handle this:
50 |     - Inconsistent behavior and configuration for resiliency, requiring system admins to know specifics of each SDK.
51 | 
52 | * Are there any trade-offs being made? (space for time, for example)
53 |   1. Simplification of the retry policy, having an opinionated setting for most configurations.
54 |   2. No support for Circuit Breaking or API health check prior to calling the Dapr API.
55 | 
56 | * What advantages / disadvantages does this proposal have?
57 | Pros:
58 |   - Bring consistency and a simple set of configuration points that work across SDKs
59 |   - Document expected behavior for SDKs regarding timeout and retries
60 | 
61 | Cons:
62 |   - See trade-offs mentioned above.
63 | 
64 | ## Implementation Details
65 | 
66 | ### Design
67 | 
68 | * `DAPR_API_MAX_RETRIES` defines the maximum number of retries; SDKs can determine which strategy will be implemented (linear, exponential backoff, etc). `0` is the default value and means no retry. `-1` or any negative value means infinite retries.
69 | * `DAPR_API_TIMEOUT_SECONDS` defines the maximum waiting time to connect and receive a response for an HTTP or gRPC call. Defaults to `0`. `0` (or negative) is handled as "undefined" and calls might hang forever on the client side. This setting is the timeout for each API invocation and not the timeout of the aggregated time for retries. This setting can be used without retries.
70 | * All environment variables can be overwritten via parameters to the Dapr client or on a per-request basis, in the following order (higher priority on top):
71 |   1. Per-request parameter
72 |   2. Parameter when instantiating a Dapr client object
73 |   3. Properties or any other language specific configuration framework.
74 |   4. Environment variables
75 | * SDK to retry if the error is on connection.
76 | * SDK to retry in case of the following retriable codes:
77 |   * gRPC: DEADLINE_EXCEEDED, UNAVAILABLE.
78 |   * HTTP: 408, 429 (respect `Retry-After` header), 500, 502, 503, 504
79 | * The same client should still be usable if the API goes down but is restored after any arbitrary amount of time. In other words, the unavailability of the Dapr API should not require the application to restart.
80 | 
81 | #### Example of implementation
82 | 
83 | https://github.com/dapr/java-sdk/pull/889
84 | 
85 | ### Feature lifecycle outline
86 | 
87 | * Compatibility guarantees
88 | Retries and timeouts should be disabled by default.
89 | 
90 | * Deprecation / co-existence with existing functionality
91 | If customers prefer a more fine-tuned resiliency logic, they can still achieve this by disabling the SDK resiliency and using a 3rd-party library to handle retries with custom logic.
92 | 
93 | * Feature flags
94 | Retries and timeouts are disabled by default with the value `0`.
95 | 
96 | ### Acceptance Criteria
97 | 
98 | How will success be measured?
99 | 
100 | * Performance targets
101 | N/A
102 | 
103 | * Compatibility requirements
104 | Same environment variables work with any SDK.
105 | SDKs to pass a new compatibility test (in runtime).
106 | 
107 | * Metrics
108 | N/A
109 | 
110 | ## Completion Checklist
111 | 
112 | What changes or actions are required to make this proposal complete?
113 | 
114 | * SDK changes
115 |   * Add support for new environment variables
116 |   * Add new parameters when instantiating a new Dapr client
117 |   * Add per-request optional parameters
118 |   * Add integration testing on each SDK when possible (can use ToxiProxy)
119 | * Compatibility tests
120 |   * Implement a compatibility test in runtime (similar to what was done for actor invocation)
121 | * Documentation
122 | 
--------------------------------------------------------------------------------
/20230918-S-unified-api-token-env-variable.md:
--------------------------------------------------------------------------------
1 | # Unify the `DAPR_API_TOKEN` env variable across all SDKs
2 | 
3 | * Author(s): Elena Kolevska (@elena-kolevska)
4 | * State: Ready for Implementation
5 | * Updated: 2023-09-18
6 | 
7 | ## Overview
8 | 
9 | This is a design proposal to unify usage of the `DAPR_API_TOKEN` variable across all SDKs.
10 | 
11 | ## Background
12 | 
13 | Currently, the `DAPR_API_TOKEN` env variable is used in some SDKs, but not in others. For example, the Python SDK uses it, but the JavaScript SDK does not. This can be confusing for users, and makes it difficult to write documentation that is consistent across all SDKs. It is also an unnecessary complication for maintainers who switch between SDKs.
14 | 
15 | And finally, with a unified environment variable across all SDKs, system administrators don't need to have different configurations per application based on programming language, because the same environment variables will work with every SDK.
16 | 
17 | ## Related Items
18 | 
19 | This has already been discussed in different repos, notably in [this issue](https://github.com/dapr/java-sdk/issues/303) in the java-sdk.
20 | 
21 | We are moving to unify other environment variables across all SDKs, as discussed in [this proposal](https://github.com/dapr/proposals/blob/main/0008-S-sidecar-endpoint-tls.md).
22 | 
23 | 
24 | ## Expectations and alternatives
25 | 
26 | **What is in scope for this proposal?**
27 | - All supported SDKs need to consistently support the `DAPR_API_TOKEN` environment variable for authentication to Dapr
28 | 
29 | **SDKs that already use the `DAPR_API_TOKEN` env variable:**
30 | - java-sdk
31 | - go-sdk
32 | - dotnet-sdk
33 | - python-sdk
34 | - php-sdk
35 | 
36 | **SDKs that currently don't use the `DAPR_API_TOKEN` env variable:**
37 | - js-sdk
38 | - rust-sdk
39 | - cpp-sdk
40 | 
41 | 
42 | ## Implementation Details
43 | 
44 | ### Design
45 | 
46 | - The `DAPR_API_TOKEN` environment variable will be used in all SDKs to authenticate to Dapr. The token will be passed to the Dapr sidecar via the `dapr-api-token` header.
47 | 
48 | 
49 | ### Feature lifecycle outline
50 | 
51 | - Some SDKs currently accept a Dapr API token as an argument through the constructor. In this case, when a token is specified through the constructor of the client, it will take precedence over the environment variable.
52 | - If there is an existing environment variable for the Dapr API token by a different name, the new `DAPR_API_TOKEN` variable will take precedence.
53 | 
54 | ### Acceptance Criteria
55 | 
56 | How will success be measured?
57 | 
58 | * Performance targets: N/A
59 | * Compatibility requirements: Same environment variables work with any SDK
60 | * Metrics: N/A
61 | 
62 | ## Completion Checklist
63 | 
64 | What changes or actions are required to make this proposal complete?
65 | 
66 | * SDK changes (if needed)
67 |   - Add support for the `DAPR_API_TOKEN` environment variable to all supported SDKs
68 |   - Add integration testing on each SDK when possible
69 | * Documentation
70 |   - Update documentation to reflect the new environment variable
71 | 
--------------------------------------------------------------------------------
/20231024-CIR-trust-distribution.md:
--------------------------------------------------------------------------------
1 | # Dapr Trust Distribution
2 | 
3 | * Author(s): joshvanl
4 | * State: Approved
5 | * Updated: 2023-10-24
6 | 
7 | ## Overview
8 | 
9 | This is a design proposal to implement a proper trust distribution process in Dapr.
10 | Trust distribution will be implemented in a seamless way without downtime.
11 | This will improve and unlock security related features.
12 | 
13 | ## Background
14 | 
15 | ### Motivation
16 | 
17 | To support the creation of features:
18 | 
19 | - Proper Certificate Authority (CA) rotation (without re-using the root's private key)
20 | - External CA sources such as cert-manager and cloud provider CAs etc.
21 | - Dapr multi-cluster and networking federation
22 | 
23 | Related issues:
24 | - [Multicluster Kubernetes](https://github.com/dapr/dapr/issues/3460)
25 | - [[Proposal] Support third-party CA - Integrate Cert Manager with Dapr](https://github.com/dapr/dapr/issues/3968)
26 | - [[Proposal] Automatic root certificate rotation](https://github.com/dapr/dapr/issues/5958)
27 | 
28 | ### Goals
29 | 
30 | - Implement an active trust distribution mechanism for Dapr in Kubernetes that responds to updates.
31 | - Trust distribution in self-hosted mode can be implemented by the user.
32 | - Enable root certificate rotation with no downtime.
33 | 
34 | ### Non-Goals
35 | 
36 | - Sentry implements CA root rotation.
37 | - Implement external CA support, though this proposal will enable this feature to be developed in the future.
38 | - Implement Dapr trust federation.
39 | 
40 | ### Current Shortfalls
41 | 
42 | Trust distribution is the act of propagating trust data to enable secure communication or networking between peers.
43 | In the case of Dapr, this involves propagating PEM encoded CA bundle files to clients and servers in the cluster, which are then used to authenticate peers over TLS.
44 | Today, there are two methods of CA deployment in Dapr: either Sentry-generated, or provided by the Dapr cluster administrator.
45 | From the perspective of trust distribution, these two modes are functionally the same as they both result in the `dapr-trust-bundle` ConfigMap containing the CA bundle, and the `dapr-trust-bundle` Secret containing the issuer certificate chain.
46 | From here, trust distribution for the control plane occurs by the Operator, Placement, and Injector services reading from the mounted `dapr-trust-bundle` ConfigMap.
47 | Trust distribution for Daprds occurs via the Injector patching the Daprd container with an environment variable containing the CA Bundle originating from the `dapr-trust-bundle` ConfigMap.
48 | 
49 | The problem with the current strategy is that once trust has been distributed (the `dapr-trust-bundle` ConfigMap and Secret are populated), the root of trust cannot change in the cluster.
50 | This is because trust bundles are only injected into Dapr containers at Pod creation time.
51 | Trust anchors are also set as environment variables whose values are static for the entire duration of a Unix process, meaning they cannot be dynamically updated, for example in the event of CA root rotation.
52 | Today, Dapr pods have to be restarted in order to pick up a new trust bundle.
53 | It is also undefined and untested whether the control plane components will successfully pick up a new trust bundle during execution, though whether they can is irrelevant, as Daprds do not support this feature either.
54 | 
55 | ## Solution
56 | 
57 | ### Trust Distributor
58 | 
59 | It is paramount that the entity that conducts the trust distribution is separate from the entity that issues identities from that root of trust; in Dapr's case this is Sentry.
60 | This is because trust distribution must happen out of band of identity issuance.
61 | An analogy: the Internet's roots of trust are delivered via the computer's operating system or browser, rather than fetched from the DNS servers themselves.
62 | Similarly, asset SHA hashes should be downloaded from a separate source than the asset servers themselves.
63 | Decoupling these roles also improves the separation of concerns between the identity issuer and the trust distributor.
64 | 
65 | Trust distribution will be conducted by the Operator and written to the ConfigMap `dapr-root-ca.crt` in all Namespaces.
66 | The Operator is a natural fit as it is not Sentry (the identity issuance server), and machinery for Kubernetes controllers already exists in the Operator today.
67 | ConfigMaps are a natural choice as they can be mounted by Pods & containers in Kubernetes, and trust bundles are not secrets, so Secrets are not appropriate.
68 | There is also prior art of other projects distributing trust in this way, such as [Istio](https://github.com/istio/istio/blob/4c65649a9b116584281fadcaf8c3dd6b42d34036/istioctl/pkg/workload/workload_test.go#L340) and cert-manager's [trust-manager](https://github.com/cert-manager/trust-manager#example-bundle).
69 | We can also add support for writing to Kubernetes [ClusterTrustBundles](https://github.com/kubernetes/enhancements/issues/3257), though this resource is very new, and will not be available in all target Kubernetes cluster versions.
70 | The `dapr-root-ca.crt` ConfigMap name is consistent with Kubernetes and Istio naming.
71 | The operator will source the root of trust to be distributed from the mounted `dapr-root-ca.crt` Secret in the control plane namespace.
72 | The operator will watch this mounted file for updates using [fsnotify](https://github.com/fsnotify/fsnotify), and distribute the contents to the named ConfigMap in all namespaces.
73 | 
74 | Once propagated, the Injector, Placement, and Dapr sidecars can mount this ConfigMap and use it as the root of trust when connecting to peers.
75 | Similarly, when the file is updated, these services can update their local trust stores to use the new version of the bundle (a minimal sketch of this watch-and-reload loop is shown at the end of this section).
76 | 
77 | The Operator will need to metadata-watch all Namespaces and ConfigMaps in the cluster.
78 | The Operator should not fully inform these resources, as that would massively increase its memory consumption.
79 | In the event of a Namespace being created, the Operator will write the ConfigMap to that namespace.
80 | The Operator will also ensure that the `dapr-root-ca.crt` ConfigMap stays consistent with its local trust bundle version.
81 | 
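To make the watch-and-reload behaviour concrete, here is a minimal Go sketch of the kind of loop described above; the mount path and callback are hypothetical, and Dapr's actual implementation details may differ:

```go
package main

import (
	"crypto/x509"
	"log"
	"os"

	"github.com/fsnotify/fsnotify"
)

// watchTrustAnchors reloads the CA bundle whenever the mounted file changes,
// handing the fresh pool to onUpdate so new TLS connections use it.
func watchTrustAnchors(path string, onUpdate func(*x509.CertPool)) error {
	load := func() {
		pem, err := os.ReadFile(path)
		if err != nil {
			log.Printf("read trust anchors: %v", err)
			return
		}
		pool := x509.NewCertPool()
		if pool.AppendCertsFromPEM(pem) {
			onUpdate(pool) // swap the trust store used for new connections
		}
	}
	w, err := fsnotify.NewWatcher()
	if err != nil {
		return err
	}
	if err := w.Add(path); err != nil {
		return err
	}
	load() // initial load
	go func() {
		for range w.Events {
			// Mounted ConfigMaps update via symlink swaps; production code
			// would watch the parent directory and re-add the watch.
			load()
		}
	}()
	return nil
}
```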
82 | ### CA Rotation
83 | 
84 | CA rotation can now be solved by the new trust bundle being _appended_ to the existing `dapr-root-ca.crt` Secret in the control plane namespace.
85 | This new bundle containing the old and new CA will be propagated to all services by the Operator, allowing for a zero-downtime, graceful rollover of the CA.
86 | The Dapr CLI will be updated to automate this task, ensuring that the new appended trust bundle has been correctly propagated to all namespaces before writing the new CA to Sentry.
87 | Checking propagation involves ensuring the new bundle contents are present in the named ConfigMap in all namespaces.
88 | The CLI needs to take care of the fact that mounted ConfigMaps can take up to [60 seconds](https://github.com/kubernetes/kubernetes/blob/v1.26.0/pkg/kubelet/pod_workers.go#L1175C1-L1175C96) before the file is updated on the container mount, so there is some lag between the ConfigMap being updated and the trust bundle being updated in a service's trust store.
89 | 
90 | ### External CA Support
91 | 
92 | External CA support is now made easier by the fact that the external CA trust bundle can be safely written to the `dapr-root-ca.crt` Secret in the control plane namespace.
93 | Dapr services will then trust the external CA's root of trust.
94 | 
95 | ### Dapr multi-cluster and Networking Federation
96 | 
97 | Similarly, the trust bundle of another Dapr cluster can be appended to the existing CA bundle so that the two clusters may trust one another.
98 | 
99 | ### Self Hosted Mode
100 | 
101 | Self-hosted mode will continue to function as before; however, using a file reference for the trust bundle rather than an environment variable means that services can respond and update their trust stores on file changes.
102 | 
103 | ### Deprecation
104 | 
105 | The `DAPR_TRUST_ANCHORS` environment variable in Daprd will become deprecated in favour of a file reference configured via the CLI flag `-trust-anchors-file`.
106 | For backwards compatibility, the `DAPR_TRUST_ANCHORS` environment variable will continue to be supported until `v1.14`, when the Injector service will no longer patch it into Daprd sidecar containers.
107 | 
108 | ## Completion Checklist
109 | 
110 | - [ ] In Kubernetes CA mode, Sentry writes its own generated CA bundle to the `dapr-root-ca.crt` Secret in the control plane namespace.
111 | - [ ] The Operator propagates this trust bundle to the `dapr-root-ca.crt` ConfigMap in all namespaces.
112 | - [ ] Placement, Injector, and Daprds all read and watch the trust anchors from the mounted `dapr-root-ca.crt` ConfigMap referenced by the `-trust-anchors-file` flag's value, updating their trust stores accordingly.
113 | - [ ] Dapr CLI CA rotation command updated to respect the `dapr-root-ca.crt` Secret and append the CA bundle accordingly.
114 | 
115 | ### Acceptance Criteria
116 | - Trust distribution is active and responds to updates.
117 | - Dapr CLI `mtls renew-certificate` has been updated to implement proper CA rotation.
--------------------------------------------------------------------------------
/20240508-S-sidecar-endpoint-tls.md:
--------------------------------------------------------------------------------
1 | # Dapr endpoint env and TLS support in SDKs
2 | 
3 | * Author(s): Artur Souza (@artursouza), Josh van Leeuwen (@JoshVanL)
4 | * State: Ready for Implementation
5 | * Updated: 05/08/2024
6 | 
7 | ## Overview
8 | 
9 | This is a design proposal to [support remote or shared Dapr APIs](https://github.com/dapr/dapr/issues/6035).
10 | 
11 | This will allow applications to talk to a remote or shared sidecar, without having to rely on a localhost sidecar running per app instance.
It means the communication will likely require TLS communication. 12 | 13 | ## Background 14 | 15 | ### Motivation 16 | - Applications to communicate to Dapr APIs without a local sidecar. 17 | 18 | ### Goals 19 | - Dapr users can talk to a remote Dapr API without using CLI or any other tool, by just running the application with environment variables. 20 | - System administrators don't need to have different configurations per application based on programming language, meaning the same environment variables will work with every SDK - exception is when SDK only supports HTTP or GRPC, but sysadmin can simply always setup environment variables for both protocols to guarantee consistency. 21 | 22 | ### Current Shortfalls 23 | - Inconsistency on setting up Dapr's sidecar endpoint on each SDK. 24 | - Not every SDK support a secure endpoint. 25 | 26 | ## Related Items 27 | 28 | ### Related proposals 29 | 30 | Formalizing the proposal here from [this issue](https://github.com/dapr/dapr/issues/6035). 31 | 32 | ## Expectations and alternatives 33 | 34 | * What is in scope for this proposal? 35 | - SDKs to support a consistent pair of environment variables to setup Dapr API 36 | - SDKs to support TLS endpoints for Dapr API 37 | 38 | * What is deliberately *not* in scope? 39 | - SSL certificate pinning 40 | - Have consistency of other environment variables for SDK (`DAPR_HOST`, `DAPR_SIDECAR_IP`, etc) 41 | - Have consistency of how Dapr client is instanciated on each SDK 42 | 43 | 44 | * What alternatives have been considered, and why do they not solve the problem? 45 | 1. Leave every SDK as-is: 46 | - Not every SDK offers an environment variable to configure Dapr endpoint, forcing configuration in code 47 | - Environment variables per SDK, forcing sysadmin to know about each application's language use 48 | - Not every SDK supports TLS endpoint 49 | 2. Add TLS support only, giving each SDK room to decide on how to expose it to the user 50 | - Not every SDK offers an environment variable to configure Dapr endpoint, forcing configuration in code 51 | - Environment variables per SDK, forcing sysadmin to know about each application's language use 52 | 53 | * Are there any trade-offs being made? (space for time, for example) 54 | 1. Leaving existing environment variables for host and port as-is per SDK, but driving consistency on this new way. 55 | 2. Not changing Dapr's DAPR_HOST (or equivalent), DAPR_HTTP_PORT and DAPR_GRPC_PORT. 56 | 57 | * What advantages / disadvantages does this proposal have? 58 | Pros: 59 | - Bring consistency in Dapr API endpoint configuration cross SDKs 60 | - Add support for TLS endpoint 61 | 62 | Cons: 63 | - Does not address existing inconsistencies in client instantiation and env variables 64 | - Needs to define a priority between new env variables and old ones 65 | 66 | ## Implementation Details 67 | 68 | ### Design 69 | 70 | * `DAPR_GRPC_ENDPOINT` defines entire endpoint for gRPC, not just host: `dapr-grpc.mycompany.com`. No port in the URL defaults to 443. 71 | * `DAPR_HTTP_ENDPOINT` defines entire endpoint for HTTP, not just host: `https://dapr-http.mycompany.com` 72 | * Port is parsed from the hostport string (`dapr.mycompany.com:8080`) or via the default port of the protocol used in the URL (80 for `plaintext` and 443 for `TLS`) 73 | * `DAPR_GRPC_ENDPOINT` and `DAPR_HTTP_ENDPOINT` can be set at the same time since some SDKs (Java, as of now) supports both protocols at the same time and app can pick which one to use. 
74 | * `DAPR_HTTP_ENDPOINT` must be parsed, and the protocol will be used by the SDK to determine if communication is over TLS (if not done automatically). In summary, `https` means a secure channel.
75 | * `DAPR_GRPC_ENDPOINT` must be parsed, and the query parameter will be used to determine whether the endpoint uses TLS. In summary, `?tls=true` means to use TLS. An empty query parameter defaults TLS to false. SDKs should error on unrecognised or invalid query parameters.
76 | * `DAPR_GRPC_ENDPOINT` and `DAPR_HTTP_ENDPOINT` have priority over the existing `DAPR_HOST` and `DAPR_HTTP_PORT` or `DAPR_GRPC_PORT` environment variables. Hardcoded values passed by the application via the constructor take priority over any environment variable. In summary, this is the priority list (highest on top):
77 |   1. Values passed via constructor or builder method.
78 |   2. Properties or any other language specific configuration framework.
79 |   3. `DAPR_GRPC_ENDPOINT` and `DAPR_HTTP_ENDPOINT`
80 |   4. Existing `DAPR_HOST` (or equivalent, defaulting to `127.0.0.1`) + `DAPR_HTTP_PORT` or `DAPR_GRPC_PORT`
81 | 
82 | `DAPR_GRPC_ENDPOINT` host port parsing example:
83 | 
84 | ```
85 | myhost => port=443 tls=false resolver=dns
86 | myhost?tls=false => port=443 tls=false resolver=dns
87 | myhost:443 => port=443 tls=false resolver=dns
88 | myhost:1003 => port=1003 tls=false resolver=dns
89 | myhost:1003?tls=true => port=1003 tls=true resolver=dns
90 | dns://myhost:1003?tls=true => port=1003 tls=true resolver=dns
91 | unix://my.sock => port= tls=false resolver=unix
92 | unix://my.sock?tls=true => port= tls=true resolver=unix
93 | http://myhost => port=80 tls=false resolver=dns
94 | https://myhost => port=443 tls=true resolver=dns
95 | ```
96 | 
97 | #### Example of implementation
98 | 
99 | https://github.com/dapr/java-sdk/blob/76aec01e9aa4af7a72b910d77685ddd3f0bf86f3/sdk/src/main/java/io/dapr/client/DaprClientBuilder.java#L172C3-L192
100 | 
101 | ### Feature lifecycle outline
102 | 
103 | * Compatibility guarantees
104 | This feature should also allow a localhost definition, e.g. `http://127.0.0.1:3500`.
105 | 
106 | * This feature should continue to allow using resolvers other than DNS (e.g.
107 | `unix://`).
108 | 
109 | * Deprecation / co-existence with existing functionality
110 | This feature takes priority over existing (inconsistent) environment variables from each SDK. If the app provides a hardcoded value for the Dapr endpoint (via constructor, for example), it takes priority.
111 | Use of the existing `DAPR_API_TOKEN` environment variable is highly encouraged for remote APIs but not required.
112 | 
113 | * SDKs will continue to accept the old behaviour of `DAPR_GRPC_ENDPOINT` with
114 | the scheme value `https` to signal to use TLS. Where a value contains both the
115 | `https` scheme and the `?tls=false` query, SDKs will error and refuse to connect.
116 | 
117 | * Feature flags
118 | N/A
119 | 
120 | ### Acceptance Criteria
121 | 
122 | How will success be measured?
123 | 
124 | * Performance targets
125 | N/A
126 | 
127 | * Compatibility requirements
128 | Same environment variables work with any SDK - except if the protocol is not supported by the given SDK.
129 | 
130 | * Metrics
131 | N/A
132 | 
133 | ## Completion Checklist
134 | 
135 | What changes or actions are required to make this proposal complete?
136 | 137 | * SDK changes 138 | * Add support for new environment variable 139 | * Add integration testing on each SDK when possible 140 | * Documentation 141 | 142 | ## Test matrix 143 | 144 | | URL | Endpoint string to pass to grpc client | Hostname | Port | TLS | Error | 145 | | ------------------------------------------------------------ | ----------------------------------------- | --------------------------------- | ---- | ------ | --------------------------------------------------------------------------- | 146 | | :5000 | dns:localhost:5000 | localhost | 5000 | FALSE | | 147 | | :5000?tls=false | dns:localhost:5000 | localhost | 5000 | FALSE | | 148 | | :5000?tls=true | dns:localhost:5000 | localhost | 5000 | TRUE | | 149 | | myhost | dns:myhost:443 | myhost | 443 | FALSE | | 150 | | myhost?tls=false | dns:myhost:443 | myhost | 443 | FALSE | | 151 | | myhost?tls=true | dns:myhost:443 | myhost | 443 | TRUE | | 152 | | myhost:443 | dns:myhost:443 | myhost | 443 | FALSE | | 153 | | myhost:443?tls=false | dns:myhost:443 | myhost | 443 | FALSE | | 154 | | myhost:443?tls=true | dns:myhost:443 | myhost | 443 | TRUE | | 155 | | [http://myhost](http://myhost) | dns:myhost:80 | myhost | 80 | FALSE | | 156 | | [http://myhost?tls=false](http://myhost?tls=false) | | | | | the tls query parameter is not supported for http(s) endpoints: 'tls=false' | 157 | | [http://myhost?tls=true](http://myhost?tls=true) | | | | | the tls query parameter is not supported for http(s) endpoints: 'tls=true' | 158 | | [http://myhost:443](http://myhost:443) | dns:myhost:443 | myhost | 443 | FALSE | | 159 | | [http://myhost:443?tls=false](http://myhost:443?tls=false) | | | | | the tls query parameter is not supported for http(s) endpoints: 'tls=false' | 160 | | [http://myhost:443?tls=true](http://myhost:443?tls=true) | | | | | the tls query parameter is not supported for http(s) endpoints: 'tls=true' | 161 | | [http://myhost:5000](http://myhost:5000) | dns:myhost:5000 | myhost | 5000 | FALSE | | 162 | | [http://myhost:5000?tls=false](http://myhost:5000?tls=false) | | | | | the tls query parameter is not supported for http(s) endpoints: 'tls=false' | 163 | | [http://myhost:5000?tls=true](http://myhost:5000?tls=true) | | | | | the tls query parameter is not supported for http(s) endpoints: 'tls=true' | 164 | | [https://myhost:443](https://myhost:443) | dns:myhost:443 | myhost | 443 | TRUE | | 165 | | [https://myhost:443?tls=false](https://myhost:443?tls=false) | | | | | the tls query parameter is not supported for http(s) endpoints: 'tls=false' | 166 | | [https://myhost:443?tls=true](https://myhost:443?tls=true) | | | | | the tls query parameter is not supported for http(s) endpoints: 'tls=true' | 167 | | dns:myhost | dns:myhost:443 | myhost | 443 | FALSE | | 168 | | dns:myhost?tls=false | dns:myhost:443 | myhost | 443 | FALSE | | 169 | | dns:myhost?tls=true | dns:myhost:443 | myhost | 443 | TRUE | | 170 | | dns://myauthority:53/myhost | dns://myauthority:53/myhost:443 | myhost | 443 | FALSE | | 171 | | dns://myauthority:53/myhost?tls=false | dns://myauthority:53/myhost:443 | myhost | 443 | FALSE | | 172 | | dns://myauthority:53/myhost?tls=true | dns://myauthority:53/myhost:443 | myhost | 443 | TRUE | | 173 | | dns://myhost | | | | | invalid dns authority 'myhost' in URL 'dns://myhost' | 174 | | unix:my.sock | unix:my.sock | my.sock | | FALSE | | 175 | | unix:my.sock?tls=true | unix:my.sock | my.sock | | TRUE | | 176 | | unix://my.sock | unix://my.sock | my.sock | | FALSE | | 177 | | unix://my.sock?tls=true | 
unix://my.sock | my.sock | | TRUE | | 178 | | unix-abstract:my.sock | unix-abstract:my.sock | my.sock | | FALSE | | 179 | | unix-abstract:my.sock?tls=true | unix-abstract:my.sock | my.sock | | TRUE | | 180 | | vsock:mycid:5000 | vsock:mycid:5000 | mycid | 5000 | FALSE | | 181 | | vsock:mycid:5000?tls=true | vsock:mycid:5000 | mycid | 5000 | TRUE | | 182 | | [2001:db8:1f70::999:de8:7648:6e8] | dns:[2001:db8:1f70::999:de8:7648:6e8]:443 | [2001:db8:1f70::999:de8:7648:6e8] | 443 | FALSE | | 183 | | dns:[2001:db8:1f70::999:de8:7648:6e8]:5000 | dns:[2001:db8:1f70::999:de8:7648:6e8]:5000 | [2001:db8:1f70::999:de8:7648:6e8] | 5000 | FALSE | | 184 | | dns:[2001:db8:1f70::999:de8:7648:6e8]:5000?abc=[] | | | | | Error: query parameters are not supported for gRPC endpoints: 'abc=[]' | 185 | | dns://myauthority:53/[2001:db8:1f70::999:de8:7648:6e8] | dns://myauthority:53/[2001:db8:1f70::999:de8:7648:6e8]:443 | [2001:db8:1f70::999:de8:7648:6e8] | 443 | FALSE | | 186 | | dns:[2001:db8:1f70::999:de8:7648:6e8] | dns:[2001:db8:1f70::999:de8:7648:6e8]:443 | [2001:db8:1f70::999:de8:7648:6e8] | 443 | FALSE | | 187 | | https://[2001:db8:1f70::999:de8:7648:6e8] | dns:[2001:db8:1f70::999:de8:7648:6e8]:80 | [2001:db8:1f70::999:de8:7648:6e8] | 80 | TRUE | | 188 | | https://[2001:db8:1f70::999:de8:7648:6e8]:5000 | dns:[2001:db8:1f70::999:de8:7648:6e8]:5000 | [2001:db8:1f70::999:de8:7648:6e8] | 5000 | TRUE | | 189 | | host:5000/v1/dapr | | | | | paths are not supported for gRPC endpoints: '/v1/dapr' | 190 | | host:5000/?a=1 | | | | | paths are not supported for gRPC endpoints: '/' | 191 | | inv-scheme://myhost | | | | | invalid scheme 'inv-scheme' in URL 'inv-scheme://myhost' | 192 | | inv-scheme:myhost:5000 | | | | | invalid scheme 'inv-scheme' in URL 'inv-scheme:myhost:5000' | 193 | -------------------------------------------------------------------------------- /20240517-R-http-metrics-path-matching.md: -------------------------------------------------------------------------------- 1 | # HTTP Metrics Path Matching 2 | 3 | * Author(s): @nelson-parente @jmprusi 4 | * State: Ready for Implementation 5 | * Updated: 2024-05-17 6 | 7 | ## Overview 8 | 9 | This is a design proposal to implement a new opt-in API for path matching within Dapr HTTP metrics. By enabling path matching users can define paths that will be matched and replaced without being at risk of unbounded path cardinality and other issues that motivate the introduction of the low cardinality mode in Dapr. This will enable users to have more meaningful and manageable metrics in a controlled way. 10 | 11 | ## Background 12 | 13 | In [#6723](https://github.com/dapr/dapr/issues/6723), Dapr reduced the cardinality of its HTTP metrics in order to address memory issues users reported and restrain unbounded path cardinality which posed as a security threat. This change introduced two cardinality modes (high/low) controlled by the `increasedCardinality` flag. 14 | 15 | The caveat with low cardinality is that it dropped paths since they were one of the sources for the high cardinality. While this is a reasonable approach, it leads in the loss of important data needed for monitoring, performance analysis, and troubleshooting. To address this, we opened [#7719](https://github.com/dapr/dapr/issues/7719). 16 | 17 | This proposal introduces an opt-in API that allows users to define the paths that matter the most, effectively adding matched paths to metrics without relying on regex's, which are known to be CPU-intensive. 
18 | 19 | With this API, users will be able to configure path matching through a simple interface, providing the paths they care about and tailoring metrics to their specific requirements without compromising memory and security issues. 20 | 21 | ## Related Items 22 | 23 | ### Related issues 24 | 25 | Initial low cardinality issue: [#6723](https://github.com/dapr/dapr/issues/6723) 26 | Issue related with low cardinality dropped metrics data: [#7719](https://github.com/dapr/dapr/issues/7719) 27 | 28 | ## Expectations and alternatives 29 | 30 | The proposed solution adds value to users' observability without compromising security and memory usage. The API is designed to be simple to configure, allowing users to configure the paths they care about. We considered other regex-based solutions but these are known to be CPU-intensive and can lead to performance degradation. 31 | 32 | ## Implementation Details 33 | 34 | ### Solution 35 | 36 | This proposal introduces an opt-in API for path matching within Dapr HTTP metrics. The goal is to offer a way to match and include paths in the metrics without relying on CPU-intensive regex's and with a guarantee that path cardinality is controlled. 37 | 38 | ```yaml 39 | spec: 40 | metric: 41 | enabled: true 42 | http: 43 | increasedCardinality: true 44 | pathMatching: 45 | - /orders/{orderID}/items/{itemID} 46 | - /users/{userID} 47 | - /categories/{categoryID}/subcategories/{subCategoryID} 48 | - /customers/{customerID}/orders/{orderID} 49 | ``` 50 | 51 | ##### Examples 52 | 53 | Examples of how the Path Matching API can be used in the metrics. The examples compare the metric `dapr_http_server_request_count` with the possible configuration combinations: low and high cardinality, with and without path matching. 54 | 55 | - Low Cardinality Without Path Matching 56 | 57 | 58 | ```yaml 59 | http: 60 | increasedCardinality: false 61 | ``` 62 | 63 | ``` 64 | dapr_http_server_request_count{app_id="ping",method="InvokeService/ping",status="200"} 5 65 | ``` 66 | - Low Cardinality With Path Matching 67 | 68 | ```yaml 69 | http: 70 | increasedCardinality: false 71 | pathMatching: 72 | - /orders/{orderID} 73 | ``` 74 | 75 | ``` 76 | dapr_http_server_request_count{app_id="ping",method="GET",path="/orders/{orderID}",status="200"} 4 77 | dapr_http_server_request_count{app_id="ping",method="GET",path="",status="200"} 1 78 | ``` 79 | 80 | - High Cardinality Without Path Matching 81 | 82 | ```yaml 83 | http: 84 | increasedCardinality: true 85 | ``` 86 | 87 | ``` 88 | dapr_http_server_request_count{app_id="ping",method="GET",path="/items/123456",status="200"} 1 89 | dapr_http_server_request_count{app_id="ping",method="GET",path="/orders/1234",status="200"} 1 90 | dapr_http_server_request_count{app_id="ping",method="GET",path="/orders/12345",status="200"} 1 91 | dapr_http_server_request_count{app_id="ping",method="GET",path="/orders/123456",status="200"} 1 92 | dapr_http_server_request_count{app_id="ping",method="GET",path="/orders/1234567",status="200"} 1 93 | ``` 94 | 95 | - High Cardinality With Path Matching 96 | 97 | ```yaml 98 | http: 99 | increasedCardinality: true 100 | pathMatching: 101 | - /orders/{orderID} 102 | ``` 103 | 104 | ``` 105 | dapr_http_server_request_count{app_id="ping",method="GET",path="/items/123456",status="200"} 1 106 | dapr_http_server_request_count{app_id="ping",method="GET",path="/orders/{orderID}",status="200"} 4 107 | ``` 108 | #### Features 109 | 110 | - `pathMatching` where users can specify paths for path matching. 
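To illustrate the matching semantics, here is a minimal sketch built on Go's standard-library pattern matching (the same pattern syntax referenced below); the `newPathNormalizer` helper is a hypothetical illustration, not Dapr's implementation:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newPathNormalizer maps a concrete request path to the configured pattern
// (a low-cardinality label), or "" when nothing matches.
func newPathNormalizer(patterns []string) func(r *http.Request) string {
	mux := http.NewServeMux()
	for _, p := range patterns {
		mux.Handle(p, http.NotFoundHandler()) // handler body is irrelevant; we only use the matcher
	}
	return func(r *http.Request) string {
		_, pattern := mux.Handler(r) // returns the registered pattern, "" if unmatched
		return pattern
	}
}

func main() {
	normalize := newPathNormalizer([]string{"/orders/{orderID}", "/users/{userID}"})
	req := httptest.NewRequest(http.MethodGet, "/orders/1234", nil)
	fmt.Println(normalize(req)) // prints "/orders/{orderID}" (requires Go 1.22+ wildcard patterns)
}
```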
111 | 
112 | The path matching will use the same patterns as the Go standard library (see https://pkg.go.dev/net/http#hdr-Patterns), ensuring reliable and well-supported path matching.
113 | 
114 | When the increasedCardinality flag is set to false (default in 1.14), non-matched paths are transformed into a catch-all bucket to control and limit cardinality, preventing unbounded growth. On the other hand, when increasedCardinality is true, non-matched paths are passed through as they normally would be, allowing for potentially higher cardinality but preserving the original path data. This is the main difference in the feature behavior when used with low versus high cardinality.
115 | 
116 | This Path Matching API empowers users that rely on metrics and observability scraped under the low-cardinality mode that will soon be the default, providing a controlled means to manage path cardinality. Those indifferent may opt for low cardinality, while the legacy high-cardinality mode remains available for alternative needs.
117 | 
118 | ### Acceptance Criteria
119 | 
120 | - The Path Matching API is successfully integrated into the Dapr runtime, allowing users to enable/disable path matching via configuration.
121 | 
122 | ## Completion Checklist
123 | 
124 | - [ ] Implementation in daprd
125 | - [ ] API documentation
126 | - [ ] Integration, E2E tests
127 | 
--------------------------------------------------------------------------------
/20240618-RCBS-Conversation-building-block.md:
--------------------------------------------------------------------------------
1 | # Conversation building block
2 | 
3 | * Author(s): Loong Dai (@daixiang0)
4 | * Updated: 2024-06-18
5 | 
6 | ## Overview
7 | 
8 | This is a proposal for a new building block for Dapr to allow developers to leverage LLM services in a consistent way. The goal is to expose an API that lets developers ask Dapr to perform these requests on their behalf.
9 | 
10 | ## Background
11 | 
12 | Today there are many large language model servers and toolkits, each providing its own API, such as [OpenAI](https://openai.com/), [Hugging Face](https://huggingface.co/), [Kserve](https://kserve.github.io/website/latest/), [OpenVINO](https://docs.openvino.ai/) and so on.
13 | 
14 | For developers, it is hard to migrate from one to the other due to hardcoded integrations and API differences.
15 | 
16 | Startups and communities need to implement the popular APIs as soon as possible, or users may give up trying or adopting them because of the cost of migration.
17 | 
18 | This is an area where Dapr can help. We can offer an abstraction layer on top of those APIs.
19 | 
20 | ## Component YAML
21 | 
22 | A component can have its own set of attributes, as elsewhere in Dapr. For example:
23 | 
24 | ```yaml
25 | apiVersion: dapr.io/v1alpha1
26 | kind: Component
27 | metadata:
28 |   name: chatgpt4o
29 | spec:
30 |   type: conversation.chatgpt
31 |   version: v1
32 |   metadata:
33 |   - name: key
34 |     value: "bfnskdlgdhklhk53adfgsfnsgmtyqdghbid34891"
35 |   - name: model
36 |     value: "gpt-4o"
37 |   - name: endpoints
38 |     value: "us.api.openai.com,eu.api.openai.com"
39 |   - name: loadBalancingPolicy
40 |     value: "ROUNDROBIN"
41 | ```
42 | 
43 | ## gRPC APIs
44 | 
45 | In the Dapr gRPC APIs, we are extending the `runtime.v1.Dapr` service to add new methods:
46 | 
47 | > Note: APIs will have "Alpha1" added while in preview
48 | 
49 | > Note: The API token is stored in the component
50 | 
51 | ```proto
52 | // (Existing Dapr service)
53 | service Dapr {
54 |   // Converse.
55 |   rpc Converse(stream ConversationRequest) returns (stream ConversationResponse);
56 | }
57 | 
58 | // ConversationRequest is the request object for Conversation.
59 | message ConversationRequest {
60 |   // The name of the Conversation component
61 |   string name = 1;
62 |   // Conversation context - the Id of an existing chat room (like in ChatGPT)
63 |   optional string conversationContext = 2;
64 |   // Inputs for the conversation; multiple inputs are supported in one request.
65 |   repeated string inputs = 3;
66 |   // Parameters for all custom fields.
67 |   repeated google.protobuf.Any parameters = 4;
68 | }
69 | 
70 | // ConversationResult is the result for one input.
71 | message ConversationResult {
72 |   // Result for the one conversation input.
73 |   string result = 1;
74 |   // Parameters for all custom fields.
75 |   repeated google.protobuf.Any parameters = 2;
76 | }
77 | 
78 | // ConversationResponse is the response for Conversation.
79 | message ConversationResponse {
80 |   // Conversation context - the Id of an existing or newly created chat room (like in ChatGPT)
81 |   optional string conversationContext = 1;
82 | 
83 |   // An array of results.
84 |   repeated ConversationResult outputs = 2;
85 | }
86 | ```
87 | 
88 | ## HTTP APIs
89 | 
90 | The HTTP APIs mirror the gRPC APIs:
91 | 
92 | `POST /v1.0/conversation/[component]/converse` -> Converse
93 | 
94 | ```json
95 | REQUEST = {
96 |   "conversationContext": "fb512b84-7a1a-4fb4-8bd2-ac7d2ec45984",
97 |   "inputs": ["what is Dapr", "Why use Dapr"],
98 |   "parameters": {}
99 | }
100 | 
101 | RESPONSE = {
102 |   "conversationContext": "fb512b84-7a1a-4fb4-8bd2-ac7d2ec45984",
103 |   "outputs": [
104 |     {
105 |       "result": "Dapr is a distributed application runtime ...",
106 |       "parameters": {}
107 |     },
108 |     {
109 |       "result": "Dapr can help developers ...",
110 |       "parameters": {}
111 |     }
112 |   ]
113 | }
114 | 
115 | ```
116 | 
117 | > Note: URL will begin with `/v1.0-alpha1` while in preview
--------------------------------------------------------------------------------
/20240917-BR-resiliency-error-code-retries.md:
--------------------------------------------------------------------------------
1 | # Resiliency Policy Error Code Retries
2 | 
3 | * Author(s): Anton Troshin (@antontroshin), Taction (@taction)
4 | * Updated: 2024-09-18
5 | 
6 | ## Overview
7 | 
8 | This is a design proposal to provide additional functionality for Dapr Resiliency Policy Retries, so the policy can be enforced only on specific response error codes.
9 | It only focuses on the `retries` (https://docs.dapr.io/operations/resiliency/policies/#retries) part of the policy.
10 | 
11 | ## Background
12 | 
13 | In some applications, certain status codes are used to indicate business errors, and retrying the operation might not be necessary or otherwise desirable.
14 | Customizing retry behavior allows handling error codes in a more granular way that suits each use case.
15 | Currently, all errors are retried when the policy is applied.
16 | Some status codes are not retryable, and subsequent calls will result in the same error. Avoiding these retry calls will reduce the overall number of requests, traffic, and errors.
17 | 
18 | ## Related Items
19 | 
20 | https://github.com/dapr/dapr/issues/6683
21 | https://github.com/dapr/dapr/issues/6428
22 | https://github.com/dapr/dapr/issues/7697
23 | 
24 | PR:
25 | https://github.com/dapr/dapr/pull/7132
26 | 
27 | Docs:
28 | https://github.com/dapr/docs/issues/4254
29 | https://github.com/dapr/docs/issues/3859
30 | 
31 | ## Expectations and alternatives
32 | 
33 | * What is in scope for this proposal?
34 |   - HTTP and gRPC Service Invocation, direct and proxied
35 |   - Bindings
36 |   - Pub/Sub
37 | 
38 | ## Implementation Details
39 | 
40 | ### Design
41 | 
42 | Add a new object field to the `retries` policy spec that allows the user to specify the status codes that should be retried, with separate fields for HTTP and gRPC.
43 | The new fields should be optional and will default to the existing behavior, which is to retry on all errors.
44 | 
45 | ### Example 1:
46 | In this example, the retry policy will retry **_only_** on HTTP 500, the HTTP status code range 502-504 (inclusive), and the gRPC status code range 2-4 (inclusive).
47 | The rest of the status codes will not be retried.
48 | 
49 | ```yaml
50 | apiVersion: dapr.io/v1alpha1
51 | kind: Resiliency
52 | metadata:
53 |   name: myresiliency
54 | scopes:
55 |   - app1
56 | spec:
57 |   policies:
58 |     retries:
59 |       pubsubRetry:
60 |         policy: constant
61 |         duration: 5s
62 |         maxRetries: 10
63 |         matching:
64 |           httpStatusCodes: "500,502-504"
65 |           gRPCStatusCodes: "2-4"
66 | ```
67 | 
68 | ### Example 2:
69 | In this example, the retry policy will retry **_only_** on the gRPC status code range 1-15 (inclusive).
70 | However, this policy will not apply to HTTP status codes, which will be retried according to the default behavior, which is to retry on all errors.
71 | 
72 | ```yaml
73 | apiVersion: dapr.io/v1alpha1
74 | kind: Resiliency
75 | metadata:
76 |   name: myresiliency
77 | scopes:
78 |   - app1
79 | spec:
80 |   policies:
81 |     retries:
82 |       pubsubRetry:
83 |         policy: constant
84 |         duration: 5s
85 |         maxRetries: 10
86 |         matching:
87 |           gRPCStatusCodes: "1-15"
88 | ```
89 | 
90 | ### Acceptable Values
91 | The acceptable values are the same as the ones defined in the [HTTP Status Codes](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status) and [gRPC Status Codes](https://grpc.io/docs/guides/status-codes/) documentation:
92 | 
93 | - HTTP: from 100 to 599
94 | - gRPC: from 1 to 16
95 | 
96 | ### Setting Format
97 | Both the `httpStatusCodes` and `gRPCStatusCodes` fields are optional strings that can be set to a comma-separated list of status codes and/or ranges of status codes.
98 | A range must be in the format `<start>-<end>` (inclusive). Having more than one dash in a range is not allowed.
99 | 
100 | ### CRD Validation
101 | 
102 | Both field values should be validated using the Common Expression Language ([CEL](https://kubernetes.io/docs/reference/using-api/cel/)).
103 | In addition, see the Kubebuilder documentation for [CRD Validation](https://book.kubebuilder.io/reference/markers/crd-validation).
104 | 
105 | ### Parsing the configuration
106 | 
107 | The configuration values will first be parsed as comma-separated lists.
108 | Each entry in the list will then be parsed as a single status code or a range of status codes.
109 | Invalid entries will be logged and the Dapr runtime will fail to start. A sketch of this parsing logic is shown below, followed by a worked example.
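Here is a minimal sketch of that parsing logic in Go. The function name `parseStatusCodeFilter`, its arguments, and its placement are illustrative assumptions, not the actual `dapr/dapr` implementation:

```go
package resiliency

import (
	"fmt"
	"strconv"
	"strings"
)

// parseStatusCodeFilter parses a comma-separated list of status codes and
// inclusive ranges (e.g. "500,502-504") into the set of codes to retry.
// minCode and maxCode bound the valid values: 100-599 for HTTP, 1-16 for gRPC.
// Any error returned here is treated as fatal, so the runtime fails to start.
// (Hypothetical helper; names and placement are assumptions for illustration.)
func parseStatusCodeFilter(spec string, minCode, maxCode int) (map[int]struct{}, error) {
	codes := map[int]struct{}{}
	for _, entry := range strings.Split(spec, ",") {
		entry = strings.TrimSpace(entry)
		if entry == "" {
			continue // empty entries (e.g. from a trailing comma) are ignored
		}
		parts := strings.Split(entry, "-")
		switch len(parts) {
		case 1: // a single status code
			code, err := strconv.Atoi(parts[0])
			if err != nil || code < minCode || code > maxCode {
				return nil, fmt.Errorf("invalid status code %q", entry)
			}
			codes[code] = struct{}{}
		case 2: // an inclusive range in the format <start>-<end>
			start, errStart := strconv.Atoi(parts[0])
			end, errEnd := strconv.Atoi(parts[1])
			if errStart != nil || errEnd != nil || start < minCode || end > maxCode || start > end {
				return nil, fmt.Errorf("invalid status code range %q", entry)
			}
			for code := start; code <= end; code++ {
				codes[code] = struct{}{}
			}
		default: // more than one dash is not allowed
			return nil, fmt.Errorf("invalid status code range %q: more than one dash", entry)
		}
	}
	return codes, nil
}
```

Under this sketch, HTTP values would be parsed with bounds 100-599 and gRPC values with bounds 1-16, and any returned error would abort startup, matching the behavior described above.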
110 | 
111 | Example:
112 | 
113 | ```yaml
114 | apiVersion: dapr.io/v1alpha1
115 | kind: Resiliency
116 | metadata:
117 |   name: myresiliency
118 | scopes:
119 |   - app1
120 | spec:
121 |   policies:
122 |     retries:
123 |       pubsubRetry:
124 |         policy: constant
125 |         duration: 5s
126 |         maxRetries: 10
127 |         matching:
128 |           httpStatusCodes: "500,502-504,15,404-405-500,-1,0,"
129 | ```
130 | The steps to parse the configuration are:
131 | 1. Split the `httpStatusCodes` configuration string `"500,502-504,15,404-405-500,-1,0,"` by the comma character, resulting in the following list (empty strings are ignored): `["500", "502-504", "15", "404-405-500", "-1", "0"]`.
132 | 2. For each entry in the list, parse it as a single status code or a range of status codes.
133 | 3. If the entry is a single status code, add it to the list of status codes to retry.
134 | 4. If the entry is a range of status codes (validated against the bounds of the relevant HTTP or gRPC field), add all the status codes in the range to the list of status codes to retry.
135 |    - 500 is a **valid** code for HTTP
136 |    - 502-504 is a **valid** range of codes for HTTP
137 |    - 15 is an **invalid** code for HTTP; the error is logged and the application will fail to start
138 |    - 404-405-500 is an **invalid** range because it contains more than one dash; the error is logged and the application will fail to start
139 |    - -1 is an **invalid** code for HTTP; the error is logged and the application will fail to start
140 |    - 0 is an **invalid** code for HTTP; the error is logged and the application will fail to start
141 | 
142 | ### Acceptance Criteria
143 | 
144 | Integration and unit tests will be added to verify the new functionality.
145 | 
146 | ## Completion Checklist
147 | 
148 | * Code changes
149 | * Tests added (e2e, unit)
150 | * Documentation
151 | 
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 |                                  Apache License
2 |                            Version 2.0, January 2004
3 |                         http://www.apache.org/licenses/
4 | 
5 |    TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6 | 
7 |    1. Definitions.
8 | 
9 |       "License" shall mean the terms and conditions for use, reproduction,
10 |       and distribution as defined by Sections 1 through 9 of this document.
11 | 
12 |       "Licensor" shall mean the copyright owner or entity authorized by
13 |       the copyright owner that is granting the License.
14 | 
15 |       "Legal Entity" shall mean the union of the acting entity and all
16 |       other entities that control, are controlled by, or are under common
17 |       control with that entity. For the purposes of this definition,
18 |       "control" means (i) the power, direct or indirect, to cause the
19 |       direction or management of such entity, whether by contract or
20 |       otherwise, or (ii) ownership of fifty percent (50%) or more of the
21 |       outstanding shares, or (iii) beneficial ownership of such entity.
22 | 
23 |       "You" (or "Your") shall mean an individual or Legal Entity
24 |       exercising permissions granted by this License.
25 | 
26 |       "Source" form shall mean the preferred form for making modifications,
27 |       including but not limited to software source code, documentation
28 |       source, and configuration files.
29 | 
30 |       "Object" form shall mean any form resulting from mechanical
31 |       transformation or translation of a Source form, including but
32 |       not limited to compiled object code, generated documentation,
33 |       and conversions to other media types.
34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. 
You are solely responsible for determining the
150 |       appropriateness of using or redistributing the Work and assume any
151 |       risks associated with Your exercise of permissions under this License.
152 | 
153 |    8. Limitation of Liability. In no event and under no legal theory,
154 |       whether in tort (including negligence), contract, or otherwise,
155 |       unless required by applicable law (such as deliberate and grossly
156 |       negligent acts) or agreed to in writing, shall any Contributor be
157 |       liable to You for damages, including any direct, indirect, special,
158 |       incidental, or consequential damages of any character arising as a
159 |       result of this License or out of the use or inability to use the
160 |       Work (including but not limited to damages for loss of goodwill,
161 |       work stoppage, computer failure or malfunction, or any and all
162 |       other commercial damages or losses), even if such Contributor
163 |       has been advised of the possibility of such damages.
164 | 
165 |    9. Accepting Warranty or Additional Liability. While redistributing
166 |       the Work or Derivative Works thereof, You may choose to offer,
167 |       and charge a fee for, acceptance of support, warranty, indemnity,
168 |       or other liability obligations and/or rights consistent with this
169 |       License. However, in accepting such obligations, You may act only
170 |       on Your own behalf and on Your sole responsibility, not on behalf
171 |       of any other Contributor, and only if You agree to indemnify,
172 |       defend, and hold each Contributor harmless for any liability
173 |       incurred by, or claims asserted against, such Contributor by reason
174 |       of your accepting any such warranty or additional liability.
175 | 
176 |    END OF TERMS AND CONDITIONS
177 | 
178 |    APPENDIX: How to apply the Apache License to your work.
179 | 
180 |       To apply the Apache License to your work, attach the following
181 |       boilerplate notice, with the fields enclosed by brackets "[]"
182 |       replaced with your own identifying information. (Don't include
183 |       the brackets!) The text should be enclosed in the appropriate
184 |       comment syntax for the file format. We also recommend that a
185 |       file or class name and description of purpose be included on the
186 |       same "printed page" as the copyright notice for easier
187 |       identification within third-party archives.
188 | 
189 |    Copyright [yyyy] [name of copyright owner]
190 | 
191 |    Licensed under the Apache License, Version 2.0 (the "License");
192 |    you may not use this file except in compliance with the License.
193 |    You may obtain a copy of the License at
194 | 
195 |        http://www.apache.org/licenses/LICENSE-2.0
196 | 
197 |    Unless required by applicable law or agreed to in writing, software
198 |    distributed under the License is distributed on an "AS IS" BASIS,
199 |    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200 |    See the License for the specific language governing permissions and
201 |    limitations under the License.
202 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Dapr Proposals
2 | 
3 | ## Introduction
4 | 
5 | This repository stores proposals and designs for new features in Dapr (i.e. not bug fixes or minor changes) with the intention of improving visibility, keeping a historical record, and maintaining a consistent process.
6 | 
7 | ### What types of changes warrant a proposal here?
8 | 
9 | As mentioned above, any significant change that needs design and a conversation around that design should go here. As a guideline, anything that would warrant a change in the Dapr SDKs would probably require a proposal. Some specific examples would include:
10 | 
11 | * New Dapr building blocks
12 | * New APIs or breaking API changes (especially to a non-alpha component)
13 | 
14 | ## How do I create a proposal?
15 | 
16 | 1. Create a fork of this repository.
17 | 2. Copy the proposal template [templates/proposal.md](templates/proposal.md) following the format outlined below.
18 | 3. Edit the template, filling it in with the proposal (for guides, see information in `guides/`)
19 | 4. Submit a PR to `dapr/proposals` for community review.
20 | 
21 | ## Proposal name format
22 | 
23 | Proposal files are named in the following format:
24 | 
25 | > `YYYYMMDD-FLAGS-description.md`
26 | 
27 | Where *YYYY* is the 4-digit year, *MM* the 2-digit month, and *DD* the 2-digit day on which the proposal was last updated (like `20240309`, for example), and *FLAGS* is one (or possibly more) of:
28 | 
29 | * B - Building block change / creation
30 | * C - Components change / creation
31 | * I - Affects Dapr CLI
32 | * P - The proposal Process itself
33 | * R - Runtime
34 | * S - Affects SDKs
35 | 
36 | So, for example, a proposal to create a new building block, such as the workflow building block, might be something like `20240102-BRS-workflow-building-block.md`, whereas a change to the actor system, which does not require any changes to the SDKs themselves, would be something like `20240103-R-actor-reminder-system.md`.
37 | 
38 | ## Proposal Process
39 | 
40 | * The proposal will be opened as a PR against this repository
41 | * The proposal will be reviewed by the community and the author(s) of the proposal
42 | * The author(s) address questions/comments from the community in the proposal and adjust the proposal based on feedback
43 | * Once the feedback phase is complete, and a proposal has been accepted, the proposal will be merged into this repository
44 | * An issue needs to be created in dapr/dapr from the template in [templates/lifecycle.md](templates/lifecycle.md) to track the work that needs to be done to implement this proposal
45 | * Release of the feature will be slated for a specific release version of Dapr
46 | 
47 | ### Proposal acceptance
48 | 
49 | To accept a proposal, the maintainers of the relevant repository must vote using comments on the respective PR. A proposal is accepted by a majority vote supporting the proposal. When this condition is true, a maintainer from the relevant repository may approve and merge the proposal PR. While everyone is encouraged to participate and drive feedback, only the maintainers of the relevant repository have binding votes. Maintainers of other repositories and community contributors can cast non-binding votes to show support. The majority vote needed is a simple majority (more than 50% of total votes).
50 | 
51 | ## Feature lifecycle outline
52 | 
53 | Features in Dapr have a lifecycle (e.g. [Components](https://docs.dapr.io/operations/components/certification-lifecycle/)) and, as such, should have a defined set of milestones / requirements for progression between the lifecycle phases. For example, what can a user expect from a feature when it is Alpha quality? Once that is released, what is the plan to progress from Alpha to Beta, and the subsequent expectations? What is the expectation when this feature becomes Stable?
It is important to identify what functionality or performance guarantees we are making to users of Dapr when adding something new.
54 | 
55 | For example, the lifecycle expectations of a "Foobar API" that is going to replace an existing API might look something like:
56 | 
57 | Alpha:
58 | * Initial contract for the Foobar API is complete
59 | * Performance is expected to be >10TPS
60 | * Will not support serialization via XML
61 | * Data stored will not be compatible with the old API; existing data will be unavailable through this API (will need to use old API to access old data)
62 | * Only available when feature flag `foobar-api` is enabled
63 | * No migration of existing data from the old API available
64 | 
65 | Beta:
66 | * Performance meets or exceeds 1,000TPS
67 | * Enabled by default, users can opt-out via feature flag / configuration
68 | * Existing APIs marked as deprecated
69 | * Migration from previous data source / format can be done manually
70 | * XML will be supported
71 | * Backwards-incompatible changes may be made
72 | 
73 | 
74 | Stable:
75 | * Enabled by default, existing APIs have been removed fully
76 | * Documentation has been changed to remove previous API definitions
77 | * Migration from previous data source / format will be done automatically (lazily)
78 | * API is stable and changes will not be backwards-incompatible
79 | 
80 | 
81 | 
82 | ## Proposal Language
83 | 
84 | This information can be included either in the template or in a README -- and is designed to provide a common language for proposals so that the expectations are clear.
85 | 
86 | 
87 | ### Terminology
88 | 
89 | _(This is an incomplete list and should / will be expanded over time)_
90 | 
91 | | Term | Meaning |
92 | |------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|
93 | | Building block | Capabilities that solve common developmental challenges in building distributed applications |
94 | | API | Application Programming Interface - functionality exposed to end-users that can be used to interact with Dapr's building blocks in the application they are building |
95 | | Feature | New or enhanced functionality that is being added to Dapr |
96 | 
97 | ### Keywords
98 | 
99 | The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
100 | and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
101 | 
--------------------------------------------------------------------------------
/guides/api-design.md:
--------------------------------------------------------------------------------
1 | # API Design Guidelines
2 | 
3 | * Authors: Mukundan Sundararajan (@mukundansundar), John Ewart (@johnewart)
4 | * Updated: 10/25/2022
5 | 
6 | ## Proposal requirements
7 | 
8 | 
9 | For any API (new or updated), the following must be included in the proposal:
10 | 
11 | * Relevant high level design
12 | * Proposed contract for the API
13 |   * HTTP and gRPC APIs should be consistent in behavior and user experience.
14 | * Identifying what additions to existing components / creation of new components are required for this API (if any)
15 | * Scope for current and following releases (i.e. what can be expected from this iteration and what is being pushed down the road)
16 | * Known limitations, where applicable
17 |   * Performance issues
18 |   * Compatibility issues
19 | * Code examples (pseudocode is acceptable)
20 | 
21 | 
22 | ## API Lifecycle expectations
23 | 
24 | APIs are expected to go through three stages in their lifetime: Alpha, Beta, and Stable. For each of these phases, it should be clear to a user what they can expect. In the case of an API, those expectations are:
25 | 
26 | * **Alpha**
27 |   * API is not production ready yet and might contain bugs
28 |   * Recommended for non-business-critical use only because of the potential for incompatible changes in subsequent releases
29 |   * May not be backwards-compatible with an API it intends to replace
30 |   * May not be highly performant or support all SDKs
31 | 
32 | * **Beta**
33 |   * API is not production ready yet
34 |   * If an API moves into Beta, the intention is that it will continue on to become stable and not be removed
35 |   * Multiple components implement the API and the API contract is mostly finalized
36 |   * Recommended for non-business-critical use only due to potentially backwards-incompatible changes in subsequent releases
37 |   * Should have support in (at least) the _"core"_ SDKs _(i.e. Python, Go, Java)_
38 |   * Performance should be production-ready but may not be in all cases
39 | 
40 | * **Stable**
41 |   * API will not undergo backwards-incompatible changes
42 |   * API is considered ready for production usage
43 |   * Performance numbers are published for the API and there are tests and safeguards in place to prevent regression
44 | 
45 | 
46 | ## Requirements for API changes
47 | 
48 | No matter if the change is a net-new API or an update to an existing API, the following is required:
49 | 
50 | * Changes to documentation must be identified and written
51 | * Existing E2E and performance tests must pass
52 | * If a new command or a modification to an existing command is required to facilitate ease of use of the new API, the related code must be added to the Dapr CLI
53 | 
54 | ### Creation of new APIs
55 | 
56 | All new APIs that are defined start at the Alpha stage.
57 | 
58 | * Both HTTP and gRPC protocols should be supported for the new API
59 | * Documentation must be provided for the API
60 |   * HTTP API must be added to the `Reference` section in the Dapr documentation
61 | * Issues should be added in `dapr/quickstarts` to create examples for the new API to enable users to easily explore the new functionality provided by the API
62 | * If the new API is considered an _optimization_ of an existing API (say, the addition of `BulkGetSecrets` alongside `GetSecret`) then:
63 |   * The performance improvement gained due to this API should be documented
64 |   * Guidance must be provided to the users in docs as to when to use this API vs using the older one
65 | * Performance tests should (though preferably must) exist for this new API
66 | * _Should_ include new E2E tests that exercise the API
67 | 
68 | 
69 | ### Updates to existing APIs
70 | 
71 | Depending on the phase of the existing API, the proposed changes may or may not be backwards-incompatible.
72 | 
73 | _Backwards-**incompatible** changes_
74 | 
75 | * May _only_ be proposed to Alpha or Beta APIs
76 | * Require updates to existing E2E tests to support these changes
77 | * Breaking changes to existing Alpha or Beta APIs must be tracked and updated in docs/release notes
78 | 
79 | _Backwards-**compatible** changes_
80 | 
81 | * May be proposed to _any_ API
82 | * Proposed changes to both the HTTP and gRPC API must be included
83 | 
84 | 
85 | ## Requirements for Building Block changes
86 | 
87 | Finally, on addition of a new API, there may be the addition of the capability to an existing component or, if it is a new building block, the creation of a new set of components in the `dapr/components-contrib` repo.
88 | 
89 | ### Creating a new API as part of a new building block in `dapr/components-contrib`
90 | 
91 | - Interfaces to be used by `dapr/dapr` code must be defined and agreed upon
92 | - A new building block package is defined in the `components-contrib` repo; new code must only be added inside that building block package
93 | - Conformance tests enable validating a component's compliance with the defined interface for the building block and create a baseline for conformance testing any new components added. Conformance tests may be added for the new API with the understanding that it may evolve
94 | 
95 | 
96 | ### Creating a new API for an existing building block in `dapr/components-contrib`
97 | 
98 | - Interface changes for the new API must be defined and agreed upon
99 | - Existing components that support the new API must be enhanced to be in compliance with the proposed interface as per the defined and agreed upon scope of the original proposal
100 | - Conformance tests must be updated
101 |   - Get sign-off on a basic suite of conformance tests for the interface method(s)
102 |   - Implement the suite of conformance tests as part of the existing suite of tests for the building block
103 | - Ensure successful execution of existing conformance and certification tests for any modified components
104 | 
105 | 
106 | 
107 | ## Progression of an API/Building block
108 | 
109 | ### Alpha to Beta
110 | 
111 | In addition to the requirements of any Alpha API, the following must be met so that the API can graduate to Beta. For an API to be promoted to Beta, it must exist for at least one release cycle after its initial addition as Alpha.
(i.e. something added in 1.10 could become Beta in 1.12, having been stabilized through 1.11)
112 | 
113 | For all APIs, the following criteria need to be met:
114 | 
115 | * E2E tests with extensive positive and negative scenarios must be defined
116 | * Most (if not all) changes needed in the user-facing structures must be considered to be complete (in order to reduce the number of breaking changes)
117 | * All _"core"_ SDKs must have support for this API _(i.e. Python, Go, .NET, Java)_
118 | * Documentation of the API must be completely up-to-date
119 | * Quickstarts must be defined for the API allowing users to quickly explore the API
120 | * Performance tests should be added (if not already available in the Alpha stage) / updated where relevant
121 | 
122 | 
123 | For **building blocks** to progress, the following criteria are required:
124 | 
125 | * Conformance test(s) must be added (in case a new building block does not have conformance tests in the Alpha stage) / updated
126 | * Conformance tests must test both positive and negative cases (i.e. deliberately attempt to break them)
127 | * Certification tests should be added to the different components and this API must be exercised in the certification tests
128 | * Multiple implementations must be present for this building block
129 | 
130 | ### Beta to Stable
131 | 
132 | In addition to the requirements for a Beta API, the following requirements must be met so that the API can graduate to Stable. Similar to the previous phase change, this API must have been in the Beta phase for at least one full release _without any breaking changes_. In addition, the following criteria apply:
133 | 
134 | * E2E scenarios must be well defined and comprehensive
135 | * Performance tests must be added (in case a new building block does not have performance tests in the Alpha/Beta stage) / updated
136 | * Expected performance data must be added to documentation
137 | 
138 | For **building blocks** to progress, the following must also be true:
139 | 
140 | * E2E tests must exercise _at least two different implementations_ of the building block's API
141 | * Conformance tests testing both positive and negative cases must be defined
142 | * Certification tests for multiple components implementing this API must be defined
143 | 
--------------------------------------------------------------------------------
/resources/0004-BIRS-distributed-scheduler/bigPicture.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dapr/proposals/0a13058ce6dce66c90bbaea0c692c88ccacf3be1/resources/0004-BIRS-distributed-scheduler/bigPicture.png
--------------------------------------------------------------------------------
/resources/0004-BIRS-distributed-scheduler/pluggableSchedulerService.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dapr/proposals/0a13058ce6dce66c90bbaea0c692c88ccacf3be1/resources/0004-BIRS-distributed-scheduler/pluggableSchedulerService.png
--------------------------------------------------------------------------------
/resources/0004-BIRS-distributed-scheduler/publicDaprAPI.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dapr/proposals/0a13058ce6dce66c90bbaea0c692c88ccacf3be1/resources/0004-BIRS-distributed-scheduler/publicDaprAPI.png
--------------------------------------------------------------------------------
/resources/0004-BIRS-distributed-scheduler/sidecarToSchedulerComm.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dapr/proposals/0a13058ce6dce66c90bbaea0c692c88ccacf3be1/resources/0004-BIRS-distributed-scheduler/sidecarToSchedulerComm.png
--------------------------------------------------------------------------------
/resources/0004-BIRS-distributed-scheduler/watchJobsFlow.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dapr/proposals/0a13058ce6dce66c90bbaea0c692c88ccacf3be1/resources/0004-BIRS-distributed-scheduler/watchJobsFlow.png
--------------------------------------------------------------------------------
/resources/20221130-I-enhance-dapr-run-multiple-apps/interaction-flow-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dapr/proposals/0a13058ce6dce66c90bbaea0c692c88ccacf3be1/resources/20221130-I-enhance-dapr-run-multiple-apps/interaction-flow-1.png
--------------------------------------------------------------------------------
/resources/20230327-RCBS-Crypto-building-block/data-flow.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dapr/proposals/0a13058ce6dce66c90bbaea0c692c88ccacf3be1/resources/20230327-RCBS-Crypto-building-block/data-flow.png
--------------------------------------------------------------------------------
/resources/README.md:
--------------------------------------------------------------------------------
1 | ## What's in here?
2 | 
3 | Any files related to proposals (images, attachments, etc.) that are needed to embed into a proposal should go in here.
4 | 
5 | Assets should be kept in a directory that matches the proposal name - i.e. `20230114-R-foo-bar.md` should store its assets in `resources/20230114-R-foo-bar/`
6 | 
7 | 
--------------------------------------------------------------------------------
/templates/lifecycle.md:
--------------------------------------------------------------------------------
1 | # Feature name
2 | 
3 | > If you need more information on what needs to be completed, look in the `guides` directory for relevant guidance.
4 | 
5 | # Links
6 | 
7 | Links to any relevant resources go here:
8 | 
9 | * Relevant proposal
10 | * Existing issues
11 | * Milestones
12 | 
13 | # Lifecycle Expectations
14 | 
15 | ## Alpha / Beta / Stable
16 | 
17 | For each stage, identify the expectations of this feature at that stage. For example,
18 | are there any performance issues, configuration changes, or feature deprecations that will happen?
19 | 
20 | * Anticipated performance / known limitations
21 | * Compatibility guarantees / requirements
22 | * Deprecation / co-existence with existing functionality
23 | * Feature flags required
24 | 
25 | # Acceptance criteria
26 | 
27 | > For each of the stages, add any specific tasks that need to be completed before this feature reaches that particular stage. If a particular item is *not needed* then a reason should be given.
28 | 
29 | ## Alpha
30 | 
31 | - [ ] Minimum of one core SDK supports this feature (.NET / Python / Go / Java)
32 | - [ ] Feature documentation added to `dapr/docs`
33 | - [ ] Telemetry data (metrics) available for this feature
34 | - [ ] Issue opened in `dapr/quickstarts` for quickstart examples to be created
35 | 
36 | Additionally, for **APIs**:
37 | 
38 | - [ ] Both HTTP and gRPC protocols implemented
39 | - [ ] HTTP API documentation added to the `Reference` section of Dapr documentation
40 | 
41 | Additionally, for **building blocks**:
42 | 
43 | - [ ] Interfaces to be used by `dapr/dapr` code defined and agreed upon
44 | - [ ] New building block package is defined in `components-contrib` repo
45 | - [ ] Conformance tests validating the component's compliance added
46 | - [ ] Minimum of _one_ implementation (preferably something already in-use such as Redis if possible to reduce complexity)
47 | 
48 | 
49 | ## Beta
50 | 
51 | - [ ] E2E tests are up-to-date and comprehensive
52 | - [ ] SDK spec is updated
53 | - [ ] No major changes to the API have occurred in the last XXX time period (releases? months?)
54 | - [ ] Support in core SDKs
55 |   - [ ] Python
56 |   - [ ] Go
57 |   - [ ] Java
58 |   - [ ] .NET
59 |   - [ ] JavaScript
60 | - [ ] Documentation up-to-date with any new changes since Alpha
61 | - [ ] Quickstarts have been created
62 | - [ ] Performance tests exist but do not block builds
63 | 
64 | Additionally, for **building blocks**:
65 | 
66 | - [ ] Conformance tests updated to match any API changes that have been made
67 | - [ ] Conformance tests exercise both positive and negative cases
68 | - [ ] Minimum of N (three?) implementations of this building block
69 | - [ ] Certification tests for implementations
70 | - [ ] APIs that are used in the building block also meet Beta criteria
71 | 
72 | 
73 | ## Stable
74 | 
75 | 
76 | - [ ] Documentation is complete in `dapr/docs` with any changes since Beta
77 | - [ ] E2E scenarios well defined and comprehensive
78 | - [ ] Performance tests exist and regressions will prevent them from successfully passing
79 | - [ ] Performance data added to documentation (https://docs.dapr.io/operations/performance-and-scalability/)
80 | 
81 | 
--------------------------------------------------------------------------------
/templates/proposal.md:
--------------------------------------------------------------------------------
1 | # Title of proposal
2 | 
3 | * Author(s): [Author Name, Co-Author Name ...]
4 | * State: {Ready for Implementation, Implemented}
5 | * Updated: [Date]
6 | 
7 | ## Overview
8 | 
9 | A brief description of the proposal; include information such as:
10 | 
11 | * What areas are affected by this change?
12 | * What is being proposed in this document?
13 | 
14 | ## Background
15 | 
16 | This section is intended to provide the community with the reasoning behind this proposal -- why is this proposal being made? What problem is it solving for users / developers / operators and how does it solve that for them?
17 | 
18 | ## Related Items
19 | 
20 | ### Related proposals
21 | 
22 | Links to proposals that are related to this (either due to dependency, or possibly because this will replace another proposal)
23 | 
24 | ### Related issues
25 | 
26 | Please link to any issues that this proposal is related to. For example, are there existing bugs filed in various Dapr repositories that this will affect?
27 | 
28 | 
29 | ## Expectations and alternatives
30 | 
31 | * What is in scope for this proposal?
32 | * What is deliberately *not* in scope?
33 | * What alternatives have been considered, and why do they not solve the problem?
34 | * Are there any trade-offs being made? (space for time, for example)
35 | * What advantages / disadvantages does this proposal have?
36 | 
37 | ## Implementation Details
38 | 
39 | ### Design
40 | 
41 | How will this work, technically? Where applicable, include:
42 | 
43 | * Design documents
44 | * System diagrams
45 | * Code examples
46 | 
47 | ### Feature lifecycle outline
48 | 
49 | * Expectations
50 | * Compatibility guarantees
51 | * Deprecation / co-existence with existing functionality
52 | * Feature flags
53 | 
54 | ### Acceptance Criteria
55 | 
56 | How will success be measured?
57 | 
58 | * Performance targets
59 | * Compatibility requirements
60 | * Metrics
61 | 
62 | ## Completion Checklist
63 | 
64 | What changes or actions are required to make this proposal complete? Some examples:
65 | 
66 | * Code changes
67 | * Tests added (e2e, unit)
68 | * SDK changes (if needed)
69 | * Documentation
70 | 
71 | 
--------------------------------------------------------------------------------