├── .gitignore
├── 0012-BIRS-distributed-scheduler.md
├── 0013-RS-pubsub-subscription-streaming.md
├── 20221025-P-proposal-process.md
├── 20221121-R-pluggable-components-injector.md
├── 20221130-I-enhance-dapr-run-multiple-apps.md
├── 20230327-RCBS-Crypto-building-block.md
├── 20230406-B-external-service-invocation.md
├── 20230511-BCIRS-error-handling-codes.md
├── 20230627-P-proposal-sdk-approval.md
├── 20230714-S-sdk-resiliency.md
├── 20230918-S-unified-api-token-env-variable.md
├── 20231024-CIR-trust-distribution.md
├── 20240508-S-sidecar-endpoint-tls.md
├── 20240517-R-http-metrics-path-matching.md
├── 20240618-RCBS-Conversation-building-block.md
├── 20240917-BR-resiliency-error-code-retries.md
├── LICENSE
├── README.md
├── guides
│   └── api-design.md
├── resources
│   ├── 0004-BIRS-distributed-scheduler
│   │   ├── bigPicture.png
│   │   ├── pluggableSchedulerService.png
│   │   ├── publicDaprAPI.png
│   │   ├── sidecarToSchedulerComm.png
│   │   └── watchJobsFlow.png
│   ├── 20221130-I-enhance-dapr-run-multiple-apps
│   │   └── interaction-flow-1.png
│   ├── 20230327-RCBS-Crypto-building-block
│   │   └── data-flow.png
│   └── README.md
└── templates
    ├── lifecycle.md
    └── proposal.md

/.gitignore:
--------------------------------------------------------------------------------
1 | /dist
2 | .idea
3 | **/.DS_Store
4 | 
5 | github.com/
6 | 
7 | .vscode
8 | 
9 | # Visual Studio 2015/2017/2019 cache/options directory
10 | .vs/
11 | /vendor
12 | **/*.log
13 | **/.project
14 | **/.factorypath
15 | google
16 | 
17 | test_report*
18 | 
19 | # Go Workspaces (introduced in Go 1.18+)
20 | go.work
--------------------------------------------------------------------------------
/0012-BIRS-distributed-scheduler.md:
--------------------------------------------------------------------------------
1 | # Distributed Scheduler Building Block and Service
2 | 
3 | * Author(s):
4 |   * Cassie Coyle (@cicoyle)
5 |   * Yaron Schneider (@yaron2)
6 |   * Artur Souza (@artursouza)
7 | * State: Ready for Review
8 | * Updated: 2024-05-28
9 | 
10 | ## Overview
11 | 
12 | This design proposes 2 additions:
13 | - A Distributed Scheduler API Building Block
14 | - A Distributed Scheduler Control Plane Service
15 | 
16 | ## Description
17 | 
18 | A distributed scheduler is a system that manages the scheduling and orchestration of jobs across a distributed computing environment at specified times or intervals.
19 | 
20 | ## Motivation
21 | 
22 | Dapr users have a need for a distributed scheduler. The idea is to have an *orchestrator* for scheduling jobs in the future, either at a specific time or at a specific interval.
23 | 
24 | Examples include:
25 | - Scalable actor reminders
26 | - Scheduling any Dapr API to run at specific times or intervals. For example: sending Pub/Sub messages, invoking service methods, triggering input bindings, or saving state to a state store.
27 | 
28 | ## Goals
29 | 
30 | Implement a change into `dapr/dapr` that facilitates a seamless experience allowing for the scheduling of jobs across API building blocks using a new scheduler API building block and control plane service. The Scheduler Building Block is a job orchestrator, not an executor. The design guarantees *at least once* job execution with a bias towards durability and horizontal scaling over precision. This means we **can** guarantee that a job will never be invoked *before* the schedule is due, but we **cannot** guarantee a ceiling time on when the job is invoked *after* the due time is reached.
31 | 
32 | ## Non-Goals
33 | 
34 | - Retry logic
35 |   - From the Scheduler Service to the Sidecar.
A new 'sidecar' target will be added to available targets such that users can configure the Dapr Resiliency Policies.
36 | - Deep observability into jobs in the control plane
37 |   - Things beyond basic ListJobs. This might entail things like the history of prior triggered jobs or future jobs.
38 | - Applications to do REST on Jobs in other namespaces. Currently, jobs will be namespaced to the app/sidecar namespace.
39 | 
40 | ## Current Shortfalls
41 | 
42 | The **Workflows** building block is built on top of Actor Reminders, which have scale limitation issues today. The goal is to improve the performance and scale of Actor Reminders by using the distributed scheduler.
43 | 
44 | Currently, Dapr users are able to use the **Publish and Subscribe** building block, but are unable to have delayed PubSub scheduling. This scheduler service enables users to publish a message at a specific time in the future, for example a week from today or at a specific UTC date/time.
45 | 
46 | For **Service Invocation**, this building block could also benefit from a scheduler in that it would enable the scheduling of method calls between applications.
47 | 
48 | As of now, Dapr does have an **input cron binding** component, which allows users to schedule tasks. This requires a component YAML file, where users listen on an endpoint that is invoked on the configured schedule. This is limited to being an input binding only. The Scheduler Service will enable the scheduling of jobs to scale across multiple replicas, while guaranteeing that a job will only be triggered by 1 Scheduler Service instance.
49 | 
50 | *Note:* Performance is the primary focus while implementing this feature given the current shortfalls.
51 | 
52 | ## Big Picture Idea
53 | 
54 | ![Big Picture](./resources/0004-BIRS-distributed-scheduler/bigPicture.png)
55 | 
56 | If a user would like to store their user-associated data in a specific state store of their choosing, then they can provision a state store using the Dapr State Management Building Block and set `jobStateStore` as `true` in the state store component’s metadata section. Having the `jobStateStore` set to `true` means that their user-associated data will be stored in the state store of their choosing, but their job details will still be stored in the embedded etcd. If the `jobStateStore` is not configured, then the embedded etcd will be used to store both the job details and the user-associated data.
57 | 
58 | *Note:* The Scheduler functionality is usable by both Standalone (Self-Hosted) and Kubernetes modes.
59 | 
60 | ## Implementation Details
61 | 
62 | ### Building Block
63 | 
64 | #### Scenarios
65 | 
66 | ##### Example Usage
67 | 
68 | Users will have a job they would like scheduled. For example, an application performs a daily backup of their database. This backup task should run every day at a specific time to ensure data integrity. The user makes a call to schedule their job
69 | using the new Dapr Scheduler Building Block.
70 | 
71 | Example JSON (shown below) that you can use to schedule a job by making a request to `http://localhost:{daprPort}/v1.0/schedule/jobs/prd-db-backup`. This request schedules a job named `prd-db-backup` to run daily for the purpose of performing a database backup. The `@daily` schedule specification indicates that the job will run once a day, specifically at midnight (for more details, refer to the Schedule table below).
72 | 
73 | Note: This is an example to illustrate intent. The fields are purposeful for this example, and data can take any form for a job.
74 | ```json
75 | {
76 |   "schedule": "@daily",
77 |   "data": {
78 |     "task": "db-backup",
79 |     "metadata": {
80 |       "db_name": "my-prod-db",
81 |       "backup_location": "/backup-dir"
82 |     }
83 |   }
84 | }
85 | ```
86 | 
87 | Potential `dapr/go-sdk` example code:
88 | ```go
89 | import (
90 | 	... // "context", "encoding/json", "fmt", etc.
91 | 	schedulerapi "github.com/dapr/dapr/pkg/proto/scheduler/v1"
92 | 	"google.golang.org/protobuf/types/known/anypb"
93 | )
94 | ...
95 | 
96 | type Metadata struct {
97 | 	DBName         string `json:"db_name"`
98 | 	BackupLocation string `json:"backup_location"`
99 | }
100 | 
101 | type DBBackup struct {
102 | 	Task     string   `json:"task"`
103 | 	Metadata Metadata `json:"metadata"`
104 | }
105 | 
106 | func main() {
107 | 	...
108 | 
109 | 	// Define a job to be scheduled. The job's data payload is JSON-encoded bytes.
110 | 	payload, err := json.Marshal(&DBBackup{
111 | 		Task: "db-backup",
112 | 		Metadata: Metadata{
113 | 			DBName:         "my-prod-db",
114 | 			BackupLocation: "/backup-dir",
115 | 		},
116 | 	})
117 | 	if err != nil {
118 | 		fmt.Printf("Error marshaling job data: %v\n", err)
119 | 	}
120 | 	job := &schedulerapi.Job{
121 | 		Name: "prd-db-backup", Schedule: "@daily",
122 | 		Data: &anypb.Any{Value: payload},
123 | 	}
124 | 	// Schedule a job
125 | 	scheduleJobRequest := &schedulerapi.ScheduleJobRequest{
126 | 		Job: job,
127 | 	}
128 | 
129 | 	err = client.ScheduleJobAlpha1(context.Background(), scheduleJobRequest)
130 | 	if err != nil {
131 | 		fmt.Printf("Error scheduling job: %v\n", err)
132 | 	}
133 | 
134 | 	// Get a job by name
135 | 	getJobRequest := &schedulerapi.GetJobRequest{
136 | 		Name: "prd-db-backup",
137 | 	}
138 | 
139 | 	response, err := client.GetJobAlpha1(context.Background(), getJobRequest)
140 | 	if err != nil {
141 | 		fmt.Printf("Error getting job: %v\n", err)
142 | 	} else {
143 | 		job := response.Job
144 | 		fmt.Printf("Got job: %v\n", job)
145 | 	}
146 | 
147 | 	// List all jobs by app_id
148 | 	listJobsRequest := &schedulerapi.ListJobsRequest{
149 | 		AppID: "your-app-id",
150 | 	}
151 | 
152 | 	// List to be added after 1.14 release
153 | 	listResponse, err := client.ListJobsAlpha1(context.Background(), listJobsRequest)
154 | 	if err != nil {
155 | 		fmt.Printf("Error listing jobs: %v\n", err)
156 | 	} else {
157 | 		jobs := listResponse.Jobs
158 | 		fmt.Printf("List of jobs: %v\n", jobs)
159 | 	}
160 | 
161 | 	// Delete a job by name
162 | 	deleteJobRequest := &schedulerapi.DeleteJobRequest{
163 | 		Name: "prd-db-backup",
164 | 	}
165 | 
166 | 	err = client.DeleteJobAlpha1(context.Background(), deleteJobRequest)
167 | 	if err != nil {
168 | 		fmt.Printf("Error deleting job: %v\n", err)
169 | 	}
170 | 	...
171 | }
172 | ```
173 | 
174 | ##### Actor Reminders
175 | 
176 | The Scheduler Service will be deployed by default. However, for users to use the Scheduler Service for actor reminders, they will need to explicitly opt in via a preview feature.
177 | 
178 | The interval functionality of the Actor Reminder is similar to the job schedule. With Actor Reminders, a user can specify:
179 | ```json
180 | {
181 |   "dueTime": "10s",
182 |   "period": "R4/PT3S",
183 |   "ttl": "10s"
184 | }
185 | ```
186 | 
187 | Similar logic can be applied to a job in the following manner:
188 | 
189 | - `dueTime` => The time after which the job is invoked.
190 | - `period` => baked into the job parameters, as shown in the example below, where the job will run 4 times (`repeats`) and auto-deletes after 10s (`ttl`)
191 | - `repeats` => the job will run up to the number of `repeats` specified; if unspecified, it runs based on the `schedule` provided until it is deleted via a `ttl` or a user-requested deletion via the APIs
192 | 
193 | ```json
194 | {
195 |   "schedule": "@every 10s",
196 |   "dueTime": "10s",
197 |   "repeats": 4,
198 |   "ttl": "10s"
199 | }
200 | ```
201 | 
202 | The `dueTime` for jobs will follow the same formats as Actor Reminders. Supported formats:
203 | ```
204 | RFC3339 date format, e.g. 2020-10-02T15:00:00Z
205 | time.Duration format, e.g. 2h30m
206 | ISO 8601 duration format, e.g. PT2H30M
207 | ```
208 | 
209 | The `ttl` for jobs will follow the same formats as Actor Reminders. Supported formats:
210 | ```
211 | RFC3339 date format, e.g. 2020-10-02T15:00:00Z
212 | time.Duration format, e.g. 2h30m
213 | ISO 8601 duration format, e.g. PT2H30M
214 | ```
215 | 
216 | ##### Schedule
217 | 
218 | We will be using [this library](https://github.com/diagridio/go-etcd-cron), and will support the following `schedule` format.
219 | 
220 | A cron expression represents a set of times using 6 space-separated fields.
221 | 
222 | Field name   | Mandatory? | Allowed values  | Allowed special characters
223 | ----------   | ---------- | --------------  | --------------------------
224 | Seconds      | Yes        | 0-59            | * / , -
225 | Minutes      | Yes        | 0-59            | * / , -
226 | Hours        | Yes        | 0-23            | * / , -
227 | Day of month | Yes        | 1-31            | * / , - ?
228 | Month        | Yes        | 1-12 or JAN-DEC | * / , -
229 | Day of week  | Yes        | 0-6 or SUN-SAT  | * / , - ?
230 | 
231 | A user may use one of several pre-defined schedules in place of a cron expression.
232 | 
233 | Entry                   | Description                                | Equivalent To
234 | -----                   | -----------                                | -------------
235 | @yearly (or @annually)  | Run once a year, midnight, Jan. 1st        | 0 0 0 1 1 *
236 | @monthly                | Run once a month, midnight, first of month | 0 0 0 1 * *
237 | @weekly                 | Run once a week, midnight on Sunday        | 0 0 0 * * 0
238 | @daily (or @midnight)   | Run once a day, midnight                   | 0 0 0 * * *
239 | @hourly                 | Run once an hour, beginning of hour        | 0 0 * * * *
240 | 
241 | Examples of how a user's `schedule` may look:
242 | ```
243 | "0 30 * * * *"
244 | "0 * * 1,15 * Sun"
245 | "@hourly"
246 | "@every 1h30m"
247 | "@daily"
248 | ```
249 | 
250 | #### APIs
251 | 
252 | *Note:* For cases where there are multiple instances of an application trying to write the same job name concurrently, we will follow the [last-write-wins concurrency pattern](https://docs.dapr.io/developing-applications/building-blocks/state-management/howto-stateful-service/#first-write-wins-and-last-write-wins), as used in our state-management Building Block and Actor Reminders.
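To make that behavior concrete, here is a minimal hedged sketch reusing the SDK-style client and `schedulerapi` types from the earlier example (the client surface and variable names are illustrative assumptions, not a finalized SDK):

```go
// Two instances of the same app write the job name "prd-db-backup".
// Whichever ScheduleJobAlpha1 call lands last defines the stored schedule.
err = client.ScheduleJobAlpha1(ctx, &schedulerapi.ScheduleJobRequest{
	Job: &schedulerapi.Job{Name: "prd-db-backup", Schedule: "@daily"},
})

// A later write from another instance replaces the earlier definition:
// the job now triggers every 12 hours instead of daily.
err = client.ScheduleJobAlpha1(ctx, &schedulerapi.ScheduleJobRequest{
	Job: &schedulerapi.Job{Name: "prd-db-backup", Schedule: "@every 12h"},
})
```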
253 | 254 | ##### HTTP 255 | 256 | - Create a scheduled job 257 | - POST 258 | - http://localhost:{daprPort}/v1.0-alpha1/schedule/jobs/{name} 259 | 260 | - Delete a specific job by name 261 | - DELETE 262 | - http://localhost:{daprPort}/v1.0-alpha1/schedule/jobs/{name} 263 | 264 | - Get a specific job by name 265 | - GET 266 | - http://localhost:{daprPort}/v1.0-alpha1/schedule/jobs/{name} 267 | 268 | - List all jobs for an application 269 | - GET 270 | - http://localhost:{daprPort}/v1.0-alpha1/schedule/jobs?appId={app_id} 271 | 272 | ##### gRPC 273 | 274 | ![Public Dapr APIs (User Facing)](./resources/0004-BIRS-distributed-scheduler/publicDaprAPI.png) 275 | 276 | ###### User-Facing APIs 277 | 278 | ```proto 279 | service Dapr { 280 | … 281 | // Create and schedule a job 282 | rpc ScheduleJobAlpha1(ScheduleJobRequest) returns (google.protobuf.Empty) {} 283 | 284 | // Get a scheduled job 285 | rpc GetJobAlpha1(GetJobRequest) returns (GetJobResponse) {} 286 | 287 | // Delete a job 288 | rpc DeleteJobAlpha1(DeleteJobRequest) returns (google.protobuf.Empty) {} 289 | 290 | // List all jobs by app 291 | rpc ListJobsAlpha1(ListJobsRequest) returns (ListJobsResponse) {} 292 | } 293 | 294 | 295 | // Job is the definition of a job. 296 | message Job { 297 | // The unique name for the job. 298 | string name = 1; 299 | 300 | // The schedule for the job. 301 | optional string schedule = 2; 302 | 303 | // Optional: jobs with fixed repeat counts (accounting for Actor Reminders). 304 | optional uint32 repeats = 3; 305 | 306 | // Optional: sets time at which or time interval before the callback is invoked for the first time. 307 | optional string due_time = 4; 308 | 309 | // Optional: Time To Live to allow for auto deletes (accounting for Actor Reminders). 310 | optional string ttl = 5; 311 | 312 | // Job data. 313 | google.protobuf.Any data = 6; 314 | } 315 | 316 | // ScheduleJobRequest is the message to create/schedule the job. 317 | message ScheduleJobRequest { 318 | // The job details. 319 | Job job = 1; 320 | } 321 | 322 | // GetJobRequest is the message to retrieve a job. 323 | message GetJobRequest { 324 | // The name of the job. 325 | string name = 1; 326 | } 327 | 328 | // GetJobResponse is the message's response for a job retrieved. 329 | message GetJobResponse { 330 | // The job details. 331 | Job job = 1; 332 | } 333 | 334 | // DeleteJobRequest is the message to delete the job by name. 335 | message DeleteJobRequest { 336 | // The name of the job. 337 | string name = 1; 338 | } 339 | 340 | // ListJobsRequest is the message to list jobs by app_id. 341 | message ListJobsRequest { 342 | // The id of the application (app_id) for which to list jobs. 343 | string app_id = 1; 344 | } 345 | 346 | // ListJobsResponse is the response message to convey the list of jobs. 347 | message ListJobsResponse { 348 | // List of jobs that match the request criteria. 349 | repeated Job jobs = 1; 350 | } 351 | ``` 352 | 353 | ###### Daprd Sidecar to Scheduler Service APIs 354 | 355 | For the daprd sidecar to Scheduler Service communication, 356 | ![Scheduler APIs (SideCar Facing)](./resources/0004-BIRS-distributed-scheduler/sidecarToSchedulerComm.png) 357 | 358 | We will use the same exact protos from the Public Dapr API, but inside a **new**: `dapr/proto/scheduler/scheduler.proto`. 359 | The Schedule/Get/Delete job(s) will be performed via a unary call to the Scheduler Service. 
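As an illustration of that unary path, a hedged sketch of how the sidecar might wrap a user's request with its identity metadata (the `schedulerv1` alias, variable names, and connection setup are assumptions; the messages are those defined in the scheduler.proto in the next section):

```go
// Hedged sketch: daprd forwards a user's job to the Scheduler Service.
// JobMetadata lets the Scheduler namespace the job and route its trigger
// back to any instance of the same app ID.
schedule := "@daily"
_, err := schedulerClient.ScheduleJob(ctx, &schedulerv1.ScheduleJobRequest{
	Name: "prd-db-backup",
	Job: &schedulerv1.Job{
		Schedule: &schedule, // optional proto fields are pointers in generated Go
		Data:     &anypb.Any{Value: []byte(`{"task": "db-backup"}`)},
	},
	Metadata: &schedulerv1.JobMetadata{
		AppId:     "my-app",
		Namespace: "default",
		Target: &schedulerv1.JobTargetMetadata{
			Type: &schedulerv1.JobTargetMetadata_Job{Job: &schedulerv1.TargetJob{}},
		},
	},
})
```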
360 | 361 | There is a bidirectional streaming connection between the daprd sidecar and the Scheduler Service to allow for the acknowledgment of successfully triggered jobs. 362 | 363 | ###### Scheduler Service APIs 364 | 365 | In the **new** `dapr/proto/scheduler/scheduler.proto`, the daprd sidecar upon startup will establish a streaming connection with the Scheduler Service such that at the trigger time for a job the Scheduler Service will send that job to the daprd sidecar which is watching for jobs. Then the daprd sidecar will send the job to the app sending the `WatchJobsResponse` back to the Scheduler. 366 | 367 | ![WatchJobsFLow](./resources/0004-BIRS-distributed-scheduler/watchJobsFlow.png) 368 | 369 | ```proto 370 | service Scheduler { 371 | // ScheduleJob is used by the daprd sidecar to schedule a job. 372 | rpc ScheduleJob(ScheduleJobRequest) returns (ScheduleJobResponse) {} 373 | 374 | // Get a job 375 | rpc GetJob(GetJobRequest) returns (GetJobResponse) {} 376 | 377 | // DeleteJob is used by the daprd sidecar to delete a job. 378 | rpc DeleteJob(DeleteJobRequest) returns (DeleteJobResponse) {} 379 | 380 | // WatchJobs is used by the daprd sidecar to connect to the Scheduler 381 | // service to watch for jobs triggering back. 382 | rpc WatchJobs(stream WatchJobsRequest) returns (stream WatchJobsResponse) {} 383 | } 384 | 385 | message Job { 386 | // The schedule for the job. 387 | optional string schedule = 1; 388 | 389 | // Optional: jobs with fixed repeat counts (accounting for Actor Reminders). 390 | optional uint32 repeats = 2; 391 | 392 | // Optional: sets time at which or time interval before the callback is invoked for the first time. 393 | optional string due_time = 3; 394 | 395 | // Optional: Time To Live to allow for auto deletes (accounting for Actor Reminders). 396 | optional string ttl = 4; 397 | 398 | // Job data. 399 | google.protobuf.Any data = 5; 400 | } 401 | 402 | // TargetJob is the message used by the daprd sidecar to schedule a job 403 | // from an App. 404 | message TargetJob {} 405 | 406 | // TargetActorReminder is the message used by the daprd sidecar to 407 | // schedule a job from an Actor Reminder. 408 | message TargetActorReminder { 409 | // id is the actor ID. 410 | string id = 1; 411 | 412 | // type is the actor type. 413 | string type = 2; 414 | } 415 | 416 | // JobTargetMetadata holds the typed metadata associated with the job for 417 | // different origins. 418 | message JobTargetMetadata { 419 | oneof type { 420 | TargetJob job = 1; 421 | TargetActorReminder actor = 2; 422 | } 423 | } 424 | 425 | // JobMetadata is the message used by the daprd sidecar to schedule/get/delete a 426 | // job. 427 | message JobMetadata { 428 | // app_id is the App ID of the requester. 429 | string app_id = 1; 430 | 431 | // namespace is the namespace of the requester. 432 | string namespace = 2; 433 | 434 | // target is the type of the job. 435 | JobTargetMetadata target = 3; 436 | } 437 | 438 | // WatchJobsRequest is the message used by the daprd sidecar to connect to the 439 | // Scheduler and send Job process results. 440 | message WatchJobsRequest { 441 | oneof watch_job_request_type { 442 | WatchJobsRequestInitial initial = 1; 443 | WatchJobsRequestResult result = 2; 444 | } 445 | } 446 | 447 | // WatchJobsRequestInitial is the initial request to start watching for jobs. 448 | message WatchJobsRequestInitial { 449 | // app_id is the App ID of the requester. 450 | string app_id = 1; 451 | 452 | // namespace is the namespace of the requester. 
453 | string namespace = 2; 454 | 455 | // actor_types is the optional list of actor types to watch for. 456 | repeated string actor_types = 3; 457 | } 458 | 459 | // WatchJobsRequestResult is the result of a job execution to allow the job to 460 | // be marked as processed. 461 | message WatchJobsRequestResult { 462 | // uuid is the uuid of the job that has finished processing. 463 | uint64 uuid = 1; 464 | } 465 | 466 | // WatchJobsResponse is the response message to convey the details of a job. 467 | message WatchJobsResponse { 468 | // name is the name of the job which was triggered. 469 | string name = 1; 470 | 471 | // uuid is the uuid of the job trigger event which should be sent back from 472 | // the client to be marked as processed. 473 | uint64 uuid = 2; 474 | 475 | // Job data. 476 | google.protobuf.Any data = 3; 477 | 478 | // The metadata associated with the job. 479 | JobMetadata metadata = 4; 480 | } 481 | 482 | message ScheduleJobRequest { 483 | // name is the name of the job to create. 484 | string name = 1; 485 | 486 | // The job to be scheduled. 487 | Job job = 2; 488 | 489 | // The metadata associated with the job. 490 | JobMetadata metadata = 3; 491 | } 492 | 493 | message ScheduleJobResponse { 494 | // Empty as of now 495 | } 496 | 497 | // GetJobRequest is the message used by the daprd sidecar to delete or get a job. 498 | message GetJobRequest { 499 | // name is the name of the job. 500 | string name = 1; 501 | 502 | // The metadata associated with the job. 503 | JobMetadata metadata = 2; 504 | } 505 | 506 | // GetJobResponse is the response message to convey the details of a job. 507 | message GetJobResponse { 508 | // The job to be scheduled. 509 | Job job = 1; 510 | } 511 | 512 | // DeleteJobRequest is the message used by the daprd sidecar to delete or get a job. 513 | message DeleteJobRequest { 514 | string name = 1; 515 | 516 | // The metadata associated with the job. 517 | JobMetadata metadata = 2; 518 | } 519 | 520 | message DeleteJobResponse { 521 | // Empty as of now 522 | } 523 | ``` 524 | 525 | To allow for the triggered job to be sent back to any instance of the same app id that scheduled the job, we will add: 526 | ```proto 527 | // AppCallback allows user application to interact with Dapr runtime. 528 | // User application needs to implement AppCallback service if it needs to 529 | // receive message from dapr runtime. 530 | service AppCallback { 531 | ... 532 | // Sends job back to the app's endpoint at trigger time. 533 | rpc OnJobEvent (JobEventRequest) returns (JobEventResponse); 534 | } 535 | 536 | message JobEventRequest { 537 | // Job name. 538 | string name = 1; 539 | 540 | // Job data to be sent back to app. 541 | google.protobuf.Any data = 2; 542 | 543 | // Required. method is a method name which will be invoked by caller. 544 | string method = 3; 545 | 546 | // The type of data content. 547 | // 548 | // This field is required if data delivers http request body 549 | // Otherwise, this is optional. 550 | string content_type = 4; 551 | 552 | // HTTP specific fields if request conveys http-compatible request. 553 | // 554 | // This field is required for http-compatible request. Otherwise, 555 | // this field is optional. 556 | common.v1.HTTPExtension http_extension = 5; 557 | } 558 | 559 | // JobEventResponse is the response from the app when a job is triggered. 
560 | message JobEventResponse {}
561 | ```
562 | 
563 | ### Scheduler Service
564 | 
565 | ![Pluggable Scheduler Service](./resources/0004-BIRS-distributed-scheduler/pluggableSchedulerService.png)
566 | 
567 | A new `Scheduler Service` is created in the control plane. This Scheduler Service will include an embedded etcd instance (persisted) as well as a Scheduler Dapr API, which will live in `dapr/proto/scheduler/scheduler.proto`. The Scheduler Service is pluggable and allows for different implementations as needed. It is installed by default into the local development environment on `dapr init`, similar to other control plane services in Dapr. This is an optional service and runs in a local container.
568 | 
569 | To guarantee we don't have several Scheduler Service instances firing off the same job, we will have **virtual partitioning** (in-memory) such that each Scheduler Service instance owns a subset of all the jobs that exist.
570 | 
571 | ### CLI
572 | 
573 | `dapr job schedule --name=<name> --schedule="@hourly" --data="<data>"`
574 | 
575 | ### Implications
576 | 
577 | - The Scheduler Building Block and Service will result in the ***deprecation*** of `actor reminders` and the `bindings.cron` component.
578 | 
579 | ## Expectations and alternatives
580 | 
581 | * What is in scope for this proposal?
582 |   * To start, this will implement generic scheduler logic, then will be expanded to enable:
583 |     * Delayed PubSub
584 |     * Scheduled Service Invocation
585 |     * Actor Reminders
586 | * What alternatives have been considered, and why do they not solve the problem?
587 |   * Placement with Actor reminders is *very* Actor specific. The goal is to have a new scheduler in Dapr that is reusable by several building blocks.
588 | * What advantages does this proposal have?
589 |   * This design has the advantage of enabling the Scheduler Service to be implemented in different ways.
590 |   * This design also enables flexibility in where the data is stored. Whether that is in etcd in full, or partially with a reference to a state store (component) of the user's choice.
591 | 
592 | ### Acceptance Criteria
593 | 
594 | * How will success be measured?
595 |   * POCs will be done to guarantee the optimal performance solution for the Scheduler Service, testing with 3 & 5 Scheduler Service instances
596 |     * minimum RPS for registering reminders
597 |     * minimum RPS for triggers to app
598 |     * maximum number of reminders
599 |     * have a backup and recovery scenario for when the Scheduler cluster permanently fails
600 | 
601 | ## Completion Checklist
602 | 
603 | What changes or actions are required to make this proposal complete? Some examples:
604 | 
605 | * Scheduler Building Block API code
606 | * Scheduler Service code
607 | * Tests added (e2e, unit)
608 | * SDK changes
609 | * Documentation
610 | 
611 | 
--------------------------------------------------------------------------------
/0013-RS-pubsub-subscription-streaming.md:
--------------------------------------------------------------------------------
1 | # PubSub Subscription Streaming
2 | 
3 | * Author(s): @joshvanl
4 | * State: Ready for Implementation
5 | * Updated: 2024-03-05
6 | 
7 | ## Overview
8 | 
9 | This is a design proposal to implement a new Dapr runtime gRPC and HTTP API for subscription streaming.
11 | Applications will be able to dynamically subscribe and unsubscribe to topics, and receive messages without opening a port to receive incoming traffic from Dapr. 12 | 13 | ## Background 14 | 15 | Dapr supports applications subscribing to PubSub topic events. 16 | These subscriptions can be configured either: 17 | - `programmatically` by returning the subscription config on the app channel server on app health ready, or 18 | - `declaratively` via Subscription yaml manifests in Self-Hosted or Kubernetes mode. 19 | 20 | Today, it is not possible to dynamically update the subscription list without restarting Daprd, though hot reloading for Subscription manifests is [planned](https://github.com/dapr/dapr/issues/7139). 21 | It is common for users to want to dynamically subscribe and unsubscribe to topics inside their applications based on runtime conditions. 22 | In the cases where Dapr is not running as a sidecar, users often do not want to open a public port or create a tunnel in order to receive PubSub messages from Dapr. 23 | 24 | A streaming Subscription API will allow applications to dynamically subscribe to PubSub topics and receive messages without opening a port to receive incoming traffic from Dapr. 25 | 26 | ## Expectations and alternatives 27 | 28 | This proposal outlines the gRPC & HTTP streaming API for subscribing to PubSub topics. 29 | This proposal does _not_ address any hot-reloading functionality to the existing programmatic or declarative subscription configuration. 30 | Using a gRPC streaming API is the most natural fit for this feature, as it allows for first class long-lived bi-directional connections to Dapr to receive messages. 31 | A supplementary WebSocket based HTTP API is useful for applications which do not have a gRPC client available or HTTP WebSockets are preferred. 32 | These messages are typed RPC giving the best UX in each SDK. 33 | Once implemented, this feature will need to be implemented in all Dapr SDKs. 34 | 35 | ## Solution 36 | 37 | ### gRPC 38 | 39 | Rough gRPC PoC implementation: https://github.com/dapr/dapr/commit/ed40c95d11b78ab9a36a4a8f755cf89336ae5a05 40 | 41 | The Dapr runtime gRPC server will implement the following new RPC and messages: 42 | 43 | ```proto 44 | service Dapr { 45 | // SubscribeTopicEventsAlpha1 subscribes to a PubSub topic and receives topic events 46 | // from it. 47 | rpc SubscribeTopicEventsAlpha1(stream SubscribeTopicEventsRequestAlpha1) returns (stream TopicEventRequestAlpha1) {} 48 | } 49 | 50 | // SubscribeTopicEventsRequest is a message containing the details for 51 | // subscribing to a topic via streaming. 52 | // The first message must always be the initial request. All subsequent 53 | // messages must be event responses. 54 | message SubscribeTopicEventsRequestAlpha1 { 55 | oneof subscribe_topic_events_request_type { 56 | SubscribeTopicEventsSubscribeRequestAlpha1 request = 1; 57 | SubscribeTopicEventsResponseAlpha1 event_response = 2; 58 | } 59 | } 60 | 61 | // SubscribeTopicEventsSubscribeRequest is the initial message containing the 62 | // details for subscribing to a topic via streaming. 63 | message SubscribeTopicEventsSubscribeRequestAlpha1 { 64 | // The name of the pubsub component 65 | string pubsub_name = 1 [json_name = "pubsubName"]; 66 | 67 | // The pubsub topic 68 | string topic = 2 [json_name = "topic"]; 69 | 70 | // The metadata passing to pub components 71 | // 72 | // metadata property: 73 | // - key : the key of the message. 
74 |   map<string, string> metadata = 3 [json_name = "metadata"];
75 | 
76 |   // dead_letter_topic is the topic to which messages that fail to be processed
77 |   // are sent.
78 |   optional string dead_letter_topic = 4 [json_name = "deadLetterTopic"];
79 | 
80 |   // max_in_flight_messages is the maximum number of in-flight messages that
81 |   // can be processed by the subscriber at any given time.
82 |   // Default is no limit.
83 |   optional int32 max_in_flight_messages = 5 [json_name = "maxInFlightMessages"];
84 | }
85 | 
86 | // SubscribeTopicEventsResponse is a message containing the result of a
87 | // subscription to a topic.
88 | message SubscribeTopicEventsResponseAlpha1 {
89 |   // id is the unique identifier for the subscription request.
90 |   string id = 1 [json_name = "id"];
91 | 
92 |   // status is the result of the subscription request.
93 |   TopicEventResponseAlpha1 status = 2 [json_name = "status"];
94 | }
95 | ```
96 | 
97 | When an application wishes to subscribe to a topic it will initiate a stream with `SubscribeTopicEventsRequest`, and `Send` the initial request `SubscribeTopicEventsSubscribeRequest` containing the options for the subscription.
98 | Daprd will then set up the machinery to add this gRPC stream to the set of subscribers.
99 | The request contains no route or path matching configuration as all events will be sent on this stream.
100 | Subscription gRPC streams are the highest priority when Daprd determines which subscriber a message should be sent to.
101 | Only a single PubSub/topic pair may be subscribed to at a time with this API.
102 | If the first message sent to the server is not the initial request, the RPC will return an error.
103 | If any subsequent messages are not `SubscribeTopicEventsResponse` messages, the RPC will return an error.
104 | 
105 | When a message is published to the topic, Daprd will send a `TopicEventRequest` message on the stream containing the message payload and metadata.
106 | After the application has processed the message, it will send to the server a `SubscribeTopicEventsResponse` containing the `id` of the message and the `status` of the message processing.
107 | Since multiple messages can be sent and processed in the application at the same time, the event `id` is used by the server to track the status of each individual event.
108 | An event topic response will follow the timeout resiliency as currently exists for subscriptions.
109 | 
110 | Client code:
111 | 
112 | ```go
113 | stream, _ := client.SubscribeTopicEventsAlpha1(ctx)
114 | stream.Send(&rtv1.SubscribeTopicEventsRequestAlpha1{
115 | 	SubscribeTopicEventsRequestType: &rtv1.SubscribeTopicEventsRequestAlpha1_Request{
116 | 		Request: &rtv1.SubscribeTopicEventsSubscribeRequestAlpha1{
117 | 			PubsubName: "mypub", Topic: "a",
118 | 		},
119 | 	},
120 | })
121 | 
122 | client.PublishEvent(ctx, &rtv1.PublishEventRequest{
123 | 	PubsubName: "mypub", Topic: "a",
124 | 	Data:            []byte(`{"status": "completed"}`),
125 | 	DataContentType: "application/json",
126 | })
127 | 
128 | event, _ := stream.Recv()
129 | stream.Send(&rtv1.SubscribeTopicEventsRequestAlpha1{
130 | 	SubscribeTopicEventsRequestType: &rtv1.SubscribeTopicEventsRequestAlpha1_EventResponse{
131 | 		EventResponse: &rtv1.SubscribeTopicEventsResponseAlpha1{
132 | 			Id:     event.Id,
133 | 			Status: &rtv1.TopicEventResponse{Status: rtv1.TopicEventResponse_SUCCESS},
134 | 		},
135 | 	},
136 | })
137 | 
138 | stream.CloseSend()
139 | ```
140 | 
141 | ### HTTP (WebSockets)
142 | 
143 | Along with a gRPC based streaming API, a WebSocket based HTTP equivalent API will be implemented.
144 | Much like the gRPC API, the HTTP based WebSocket API will follow an initial request-response handshake, followed by a stream of messages to the client with status responses by the client, indexed by the message ID.
145 | The same proto types as used in the gRPC API (but in JSON blobs) will be used for the HTTP API.
146 | The server WebSocket implementation will be based on the [gorilla/websocket](https://github.com/gorilla/websocket) package, as this seems well used, understood and maintained.
147 | 
148 | The HTTP streaming API will be available at the following endpoint.
149 | As the pubsub and topic information is in the request body, no request configuration is given in the URL.
150 | 
151 | ```
152 | GET: /v1.0-alpha1/subscribe
153 | ```
154 | 
155 | ```json
156 | INITIAL_REQUEST (to server) = {
157 |   "pubsubName": "mypub",
158 |   "topic": "a",
159 |   "metadata": {
160 |     "key": "value"
161 |   },
162 |   "deadLetterTopic": "dead-letter-topic",
163 |   "maxInFlightMessages": 10
164 | }
165 | 
166 | TOPIC_EVENT_REQUEST (to application) = {
167 |   "id": "123",
168 |   "source": "asource",
169 |   "type": "atype",
170 |   "spec_version": "1.0",
171 |   "data_content_type": "application/json",
172 |   "data": "abc",
173 |   "topic": "a",
174 |   "pubsub_name": "mypub",
175 |   "path": "/"
176 | }
177 | 
178 | TOPIC_EVENT_RESPONSE (to server) = {
179 |   "id": "123",
180 |   "status": {
181 |     "status": "SUCCESS"
182 |   }
183 | }
184 | ```
185 | 
186 | ## Completion Checklist
187 | 
188 | - [ ] gRPC server implementation in daprd
189 | - [ ] API documentation
190 | - [ ] SDK implementations
191 |   - [ ] .NET
192 |   - [ ] Java
193 |   - [ ] Go
194 |   - [ ] Python
195 |   - [ ] JavaScript
196 | 
--------------------------------------------------------------------------------
/20221025-P-proposal-process.md:
--------------------------------------------------------------------------------
1 | # Improving the Dapr proposal process
2 | 
3 | * Author(s): John Ewart (@johnewart), Mukundan Sundararajan (@mukundansundar)
4 | * State: Approved
5 | * Date: 10/25/2022
6 | 
7 | ## Overview
8 | 
9 | This proposal is to formalize the structure and lifecycle of proposals, with three primary goals: first, make it easier for contributors to both put forth proposals as well as review them; second, increase the clarity and focus of proposals themselves; and, third, provide guidance on what is expected for a well-defined feature to be "dev complete". In order to do this, we should:
10 | 
11 | 1. Create a template for new proposals.
12 | 2. Define the core requirements for a feature that is being proposed to be considered complete.
13 | 3. Implement a process for reviewing and accepting new proposals.
14 | 
15 | ## Background
16 | 
17 | ### Why define a proposal process and templatize it?
18 | 
19 | As a community project, Dapr relies on contributors to help advance the project; the goal of this proposal is to simplify the process of contributing, and evaluating, new ideas. We want to make it more inviting for community members to propose new ideas (or evaluate them) as well as ensure that the time being spent evaluating proposals or working on new features is well spent.
20 | 
21 | Adding clarity to the proposal process, as well as some amount of structure, will hopefully make it easier for contributors of all experience levels to both contribute and review new ideas. As a new contributor it can sometimes feel a bit daunting to propose a new idea -- not knowing quite where to start, whether or not what you are proposing has the right level of information, etc. Having structure makes it easier for a new contributor who wants to propose a new idea to know what is expected and feel confident that their proposal meets those expectations. In addition, for anyone putting forward a proposal, the structure proposed prompts thinking about how someone else would use the feature, how they might benefit from it, and what other ways the feature they are proposing might be solved using existing features (or other technology).
22 | 
23 | On the other side of the equation are the community members who are reviewing those proposals; it can be challenging to review something if you feel that information or context is lacking.
A consistent structure means that reviewers can know what to expect out of a proposal document and clearly ask for more information if some is missing. And, the suggested structure would make sure that reviewers have the right information needed to have a conversation about the proposal (as well as reduce the scope of the review).
24 | 
25 | As a community, we want to be welcoming of new people and also respectful of the time and energy that everyone devotes to make this project great. I believe that adding this small amount of structure to the proposal process will help not only make it easier to propose new ideas, but also ensure that everyone who is participating can make the best use of the time they have available to improve Dapr!
26 | 
27 | ### Why define minimum requirements for a feature to be complete?
28 | 
29 | As Dapr increases in scope and brings on more contributions, it is important that we define what we expect before a new feature is added to Dapr. In order to assure that all aspects of the feature have been completed for release, we need to provide clear guidance on what needs to be accomplished before it is accepted into Dapr. This will help us to qualify these features as complete for a particular release milestone and be confident in what is being released. For example, most features would require, at a minimum:
30 | 
31 | * Completion of the code
32 | * Maintainer signoff on the implementation by the feature freeze date
33 | * Code merged into the main branch well before the code freeze date (feature freeze date at the latest)
34 |   * During the time between feature freeze and code freeze, any P0 regressions/bugs related to this feature that are identified need to be fixed.
35 | * Adding / updating performance tests, e2e tests etc.
36 | * Documentation for the new feature has been committed to `dapr/docs`
37 | * Creating / updating quick starts, tutorials (if relevant)
38 | 
39 | In addition, some features would require changes to SDKs or have additional requirements in order to be considered fully complete, and so those requirements should also be tracked in order to ensure completion.
40 | 
41 | 
42 | ## Background vs Design
43 | 
44 | > "Your scientists were so preoccupied with whether they could, they didn’t stop to think if they should." - Dr. Ian Malcolm (Jurassic Park)
45 | 
46 | The intent of the proposal is to focus on three primary areas:
47 | 
48 | * _What_ the proposal is putting forth
49 | * _Why_ the proposal is required (what will it do for users?)
50 | * _How_ the proposal will work
51 | 
52 | 
53 | ### A bit about the _why?_
54 | 
55 | In order to be effective, a proposal must provide both the background on the idea: what it is and how it works (at a high level) as well as _why it should be introduced to Dapr_. The why part of this proposal is just as important (if not more so) than what is being proposed, as it lets the reviewers understand better what kinds of use cases the feature shall enable, and how it shall make Dapr better or improve the experience of Dapr users. By going through and clarifying _why_ something should be added to Dapr, it forces us as developers to think carefully about what we are taking on as a community and how it will impact others - the benefits and potential drawbacks that it might bring.
56 | 
57 | In addition, it must also convey what is in scope and what is out of scope (i.e. what things have been deliberately omitted) along with any alternatives that have been considered and why they were not a good fit.
58 | 59 | ### Design 60 | 61 | The second half of the process focuses on the implementation of the proposal - the goal of this part is to show the community not only how it will operate, but also provide information on how success shall be measured and also include a list of activities that must be completed in order for this proposal to be complete. 62 | 63 | ## Proposed templates for Design and Build Phases 64 | 65 | See the following file: [templates/proposal.md](templates/proposal.md) 66 | 67 | ## Related Items 68 | 69 | ### Related proposals 70 | 71 | N/A 72 | 73 | ### Related issues 74 | 75 | N/A 76 | 77 | ## Expectations and alternatives 78 | 79 | ### What is in scope for this proposal? 80 | 81 | This proposal covers the process for creating, storing, and reviewing proposals. The intention is to improve the process and increase clarity around proposals, their status, and their design. 82 | 83 | ### What is deliberately *not* in scope? 84 | 85 | The planning process for including proposals in a given release is not a part of this proposal, the assumption is that process will continue to operate as it currently does. 86 | 87 | ### What advantages / disadvantages does this proposal have? 88 | 89 | This proposal has the advantage of increasing clarity of proposals as well as implicitly creating a record of design decisions; however, it is a little more involved and structured than the previous process, which may be viewed as a disadvantage. The authors of this proposal believe that the advantages significantly outweigh any potential disadvantages, however. 90 | 91 | ## Implementation Details 92 | 93 | ### Completion Checklist 94 | 95 | - [ ] A new repository (`dapr/proposals`) is created as a copy of this repository 96 | - [ ] Any relevant documentation around submitting proposals or the development process point to this new repository and process 97 | - [ ] Migration of existing _in-flight_ proposals from GitHub issues to this repository 98 | - [ ] _(Optional)_ Migration of previous proposals to this repository 99 | -------------------------------------------------------------------------------- /20221121-R-pluggable-components-injector.md: -------------------------------------------------------------------------------- 1 | # Pluggable components injector 2 | 3 | - Author(s): Marcos Candeia (@mcandeia) 4 | - State: Ready for Implementation 5 | - Updated: 11/21/2022 6 | 7 | ## Overview 8 | 9 | Pluggable components are components that are not included as part of the runtime, as opposed to built-in ones that are included. The major difference between pluggable components and built-in components is the operational burden related to bootstrap/start the pluggable component process that are not necessary when using a built-in one since they run in the same process as Dapr runtime. This operational burden is present in many ways when using pluggable components and can lead to errors and hard debugging. In addition, there are certain configurations that are tied to the Dapr and how the runtime registers the pluggable component that is repetitive and can be better handled by Dapr instead of delegating this responsibility to the end-user. This proposal suggest the addition of a new mode of execution for selected pluggable components: injectable pluggable components. 
10 | 
11 | ## Background
12 | 
13 | #### Decrease the operational burden
14 | 
15 | Even considering the new pluggable components annotation from [#5402](https://github.com/dapr/dapr/issues/5402), setting up applications to properly work with pluggable components is still not an easy task due to the operational burden related to bootstrapping containers over and over again for each application that the user needs, especially if you consider that components are often not well [scoped](https://docs.dapr.io/operations/components/component-scopes/). Without scope, a component makes itself available for all applications within the same namespace, meaning that every deployment/pod should re-do the same manual job of mounting volumes, declaring environment variables and pinning container images.
16 | 
17 | So let's say you have an application named `my-app` and another one named `my-app-2`; your two deployments/pods will look like the following:
18 | 
19 | ```yaml
20 | apiVersion: apps/v1
21 | kind: Deployment
22 | metadata:
23 |   name: app
24 |   labels:
25 |     app: app
26 | spec:
27 |   replicas: 1
28 |   selector:
29 |     matchLabels:
30 |       app: app
31 |   template:
32 |     metadata:
33 |       labels:
34 |         app: app
35 |       annotations:
36 |         dapr.io/pluggable-components: "component"
37 |         dapr.io/app-id: "my-app"
38 |         dapr.io/enabled: "true"
39 |     spec:
40 |       volumes:
41 |         - name: my-component-required-volume
42 |           emptyDir: {}
43 |       containers:
44 |         - name: my-app
45 |           image: my-app-image:latest
46 |         ### This is the pluggable component container.
47 |         - name: component
48 |           image: component:v1.0.0
49 |           volumeMounts:
50 |             - name: my-component-required-volume
51 |               mountPath: "/my-data"
52 |           env:
53 |             - name: MY_ENV_VAR_NAME
54 |               value: MY_ENV_VAR_VALUE
55 | 
56 | ---
57 | apiVersion: apps/v1
58 | kind: Deployment
59 | metadata:
60 |   name: app-2
61 |   labels:
62 |     app: app-2
63 | spec:
64 |   replicas: 1
65 |   selector:
66 |     matchLabels:
67 |       app: app-2
68 |   template:
69 |     metadata:
70 |       labels:
71 |         app: app-2
72 |       annotations:
73 |         dapr.io/pluggable-components: "component"
74 |         dapr.io/app-id: "my-app-2"
75 |         dapr.io/enabled: "true"
76 |     spec:
77 |       volumes:
78 |         - name: my-component-required-volume
79 |           emptyDir: {}
80 |       containers:
81 |         - name: my-app-2
82 |           image: my-app-2-image:latest
83 |         ### This is the pluggable component container.
84 |         - name: component
85 |           image: component:v1.0.0
86 |           volumeMounts:
87 |             - name: my-component-required-volume
88 |               mountPath: "/my-data"
89 |           env:
90 |             - name: MY_ENV_VAR_NAME
91 |               value: MY_ENV_VAR_VALUE
92 | ```
93 | 
94 | Notice that everything related to the pluggable component container is repeated, and if you have a third application that doesn't require your pluggable component to work, you have to scope your component to be initialized with only these two declared deployments/pods.
95 | 
96 | ```yaml
97 | apiVersion: dapr.io/v1alpha1
98 | kind: Component
99 | metadata:
100 |   name: my-component
101 | spec:
102 |   type: state.my-component
103 |   version: v1
104 |   metadata: []
105 | scopes:
106 |   - "my-app"
107 |   - "my-app-2"
108 | ```
109 | 
110 | For each deployment that you add to your cluster that requires such a pluggable component, you must also add it to the scope list of the component spec, which ends up being error-prone and intrusive.
111 | 
112 | #### Component spec atomicity/self-contained
113 | 
114 | Interchangeable/swappable components are one of the top features that we provide: a user can, at runtime, swap out a component with the same interface for another.
Pluggable components made this behavior more difficult to maintain as it requires coordination: for a small time window, the user must provide a way for Dapr to access both components at the same time; otherwise it becomes very difficult to orchestrate that change manually.
115 | To exemplify, suppose that we want to replace the Redis PubSub with the Kafka PubSub, and they are pluggable components. This is not only a matter of replacing the component spec itself, but it will require orchestrating the related deployments; otherwise it would lead to having an application pointing to Kafka but with no Kafka pluggable component running, and vice-versa.
116 | 
117 | The following diagram illustrates how that orchestrated change must be applied:
118 | 
119 | image
120 | 
121 | > That can't be avoided in scenarios where Dapr is not present as an orchestrator, for instance, self-hosted mode, but there are platforms that support extensibility for orchestrating applications and their dependencies, like Kubernetes.
122 | 
123 | You can argue that Kubernetes solves this scenario by reconciling the cluster state until it succeeds, but still, it severely degrades the user experience by requiring additional knowledge to build applications with Dapr.
124 | 
125 | ## Related Items
126 | 
127 | ### Related proposals
128 | 
129 | [Pluggable components Annotations](https://github.com/dapr/dapr/issues/5402)
130 | 
131 | ### Related issues
132 | 
133 | N/A
134 | 
135 | ## Expectations and alternatives
136 | 
137 | ### What is in scope for this proposal?
138 | 
139 | This proposal aims to add a new execution mode for pluggable components, the dapr-injected pluggable components.
140 | 
141 | ### What is deliberately _not_ in scope?
142 | 
143 | This proposal does not aim to manage users' pluggable components code. The goal here is to provide a better UX when using pluggable components while decreasing the operational burden.
144 | 
145 | ## Implementation Details
146 | 
147 | ### Design
148 | 
149 | This proposal aims to add a new execution mode for pluggable components, the dapr-injected pluggable components, that makes the operational experience remarkably similar to that of built-in components. The operational burden is still present somewhere, but divided into small reusable pieces.
150 | 
151 | 
152 | 
153 | | Type              | Injected by Dapr                                                                          | Managed by User/Unmanaged                                                                                            |
154 | | ----------------- | ----------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------- |
155 | | Configuration     | Dapr injects env vars and mounts the shared volumes                                       | The user manually mounts and declares shared volumes                                                                 |
156 | | Container updates | Dapr automatically detects and applies, rolling out changes based on declared components  | Users must redeploy their applications with the new desired version                                                  |
157 | | Persona           | Cluster operator/End user                                                                 | End user                                                                                                             |
158 | | Scope             | Does not need to be scoped                                                                | If not scoped, all applications should have deployed the pluggable component, otherwise runtime errors might happen  |
159 | 
160 | 
161 | 
162 | #### Component spec annotations
163 | 
164 | The component spec is still the entry point for all component types, pluggable or not. Given that the pluggable components are a subset of all user-declared components, and can in fact be inferred from the declared components, we can leverage that property to extend our component spec by adding custom annotations that allow Dapr to inject the component container at the time the Injector is also injecting the Dapr sidecar container.
165 | 
166 | Example:
167 | 
168 | ```yaml
169 | apiVersion: dapr.io/v1alpha1
170 | kind: Component
171 | metadata:
172 |   name: my-component
173 |   annotations:
174 |     dapr.io/component-container-image: "component:v1.0.0"
175 | spec:
176 |   type: state.my-component
177 |   version: v1
178 |   metadata: []
179 | ```
180 | 
181 | Optionally you can mount volumes and add env variables into the containers by using the `dapr.io/component-container-volume-mounts(-rw)` and `dapr.io/component-container-env` annotations.
182 | 
183 | ```yaml
184 | apiVersion: dapr.io/v1alpha1
185 | kind: Component
186 | metadata:
187 |   name: my-component
188 |   annotations:
189 |     dapr.io/component-container-image: "component:v1.0.0"
190 |     dapr.io/component-container-volume-mounts: "volume-name:/volume-path,volume-name-2:/volume-path-2" # read-only, "$VOLUME_NAME:$VOLUME_PATH,$VOLUME_NAME_2:$VOLUME_PATH2"
191 |     dapr.io/component-container-volume-mounts-rw: "volume-name-rw:/volume-path-rw,volume-name-2-rw:/volume-path-2-rw" # read-write "$VOLUME_NAME:$VOLUME_PATH,$VOLUME_NAME_2:$VOLUME_PATH2"
192 |     dapr.io/component-container-env: "env-var=env-var-value,env-var-2=env-var-value-2" # optional "$ENV_NAME=$ENV_VALUE,$ENV_NAME_2=$ENV_VALUE_2"
193 | spec:
194 |   type: state.my-component
195 |   version: v1
196 |   metadata: []
197 | ```
198 | 
199 | By default the injector creates undeclared volumes as `emptyDir` volumes; if you want a different volume type, you should declare it yourself in your pods.
200 | 
201 | #### Pod annotations
202 | 
203 | In order to allow users to turn off the component injector for their pod, a new annotation will be available, similar to the one that we have for enabling dapr: `dapr.io/inject-pluggable-components:"true"`.
Let's rewrite the previous examples using the injected pluggable components feature; it would be something like:
204 | 
205 | The apps deployments/pods:
206 | 
207 | ```yaml
208 | apiVersion: apps/v1
209 | kind: Deployment
210 | metadata:
211 |   name: app
212 |   labels:
213 |     app: app
214 | spec:
215 |   replicas: 1
216 |   selector:
217 |     matchLabels:
218 |       app: app
219 |   template:
220 |     metadata:
221 |       labels:
222 |         app: app
223 |       annotations:
224 |         dapr.io/inject-pluggable-components: "true"
225 |         dapr.io/app-id: "my-app"
226 |         dapr.io/enabled: "true"
227 |     spec:
228 |       containers:
229 |         - name: my-app
230 |           image: my-app-image:latest
231 | ---
232 | apiVersion: apps/v1
233 | kind: Deployment
234 | metadata:
235 |   name: app-2
236 |   labels:
237 |     app: app-2
238 | spec:
239 |   replicas: 1
240 |   selector:
241 |     matchLabels:
242 |       app: app-2
243 |   template:
244 |     metadata:
245 |       labels:
246 |         app: app-2
247 |       annotations:
248 |         dapr.io/inject-pluggable-components: "true"
249 |         dapr.io/app-id: "my-app-2"
250 |         dapr.io/enabled: "true"
251 |     spec:
252 |       containers:
253 |         - name: my-app-2
254 |           image: my-app-2-image:latest
255 | ```
256 | 
257 | And the component spec:
258 | 
259 | ```yaml
260 | apiVersion: dapr.io/v1alpha1
261 | kind: Component
262 | metadata:
263 |   name: my-component
264 |   annotations:
265 |     dapr.io/component-container-image: "component:v1.0.0"
266 |     dapr.io/component-container-volume-mounts: "my-component-required-volume:/my-data"
267 |     dapr.io/component-container-env: "MY_ENV_VAR_NAME=MY_ENV_VAR_VALUE"
268 | spec:
269 |   type: state.my-component
270 |   version: v1
271 |   metadata: []
272 | ```
273 | 
274 | ### Feature lifecycle outline
275 | 
276 | #### Expectations
277 | 
278 | The feature is expected to be delivered as part of dapr/dapr v1.10.0 as a preview feature together with the new pluggable components SDK.
279 | 
280 | #### Compatibility guarantees
281 | 
282 | Pluggable components that have been used will not be affected by this.
283 | 
284 | #### Deprecation / co-existence with existing functionality
285 | 
286 | N/A
287 | 
288 | ### Acceptance Criteria
289 | 
290 | N/A
291 | 
292 | ## Completion Checklist
293 | 
294 | What changes or actions are required to make this proposal complete? Some examples:
295 | 
296 | - [ ] Change the sidecar injector to make requests to the operator for listing components (or list it using its own role)
297 | - [ ] Add 1 more annotation for pods `dapr.io/inject-pluggable-components: "true"` and 3 more for components `dapr.io/component-container-image`, `dapr.io/component-container-env` and `dapr.io/component-container-volume-mounts`
298 | - [ ] Add the components container injector based on declared components
299 | 
--------------------------------------------------------------------------------
/20221130-I-enhance-dapr-run-multiple-apps.md:
--------------------------------------------------------------------------------
1 | # Run multiple applications with Dapr sidecars
2 | 
3 | Author(s): Mukundan Sundararajan
4 | 
5 | State: Ready for implementation
6 | 
7 | Updated: 30th Nov 2022
8 | 
9 | ## Overview
10 | 
11 | This is a proposal for a feature to be included in the dapr CLI which allows an easy way to start multiple services that need to be run in tandem, along with their `daprd` sidecars, in local self-hosted mode.
12 | 
13 | ## Background
14 | 
15 | Currently, to run multiple services along with their `daprd` sidecars locally, users need to run multiple `dapr run` commands, keep track of all ports opened, the components folders each service refers to, the config file each service refers to, etc.
16 | There are also multiple other flags that can be used to tweak the behavior of the `dapr run` command, e.g. `--unix-domain-socket`, `--dapr-internal-grpc-port`, `--app-health-check-path`, etc.
17 | 
18 | This increases the complexity of using dapr in development, where users want to run multiple services in local mode and be able to partially/fully replicate the production scenario.
19 | 
20 | In K8s mode this is alleviated through the use of helm/deployment YAML files. There is currently no such capability available for local self-hosted mode.
21 | 
22 | Asking a user to run multiple different `dapr run` commands, each with different flags, increases the complexity for users onboarding onto Dapr.
23 | 
24 | ### Why dapr CLI?
25 | 
26 | In the initial [proposal](https://github.com/dapr/community/issues/207), the solution was proposed as a separate repo and CLI in itself. But later it was suggested to add a `compose` command to the dapr CLI itself instead of having a separate CLI.
27 | The main reason for including it in the dapr CLI itself is that users do not have to download and use a new CLI in addition to the dapr CLI.
28 | 
29 | This feature is tightly coupled with, and opinionated on, how `dapr` is going to be run locally, and having a separate CLI `dapr-compose` deciding how the `dapr` CLI should be used is not a good pattern to start with.
30 | 
31 | > Note: `daprd` is more generally used and considered a binary and not necessarily a CLI tool. So the `dapr` CLI is not making use of another CLI but rather passing on configs for running a binary.
32 | 
33 | ## Related Items
34 | 
35 | ### Related Proposals
36 | - https://github.com/dapr/community/issues/207
37 | - https://github.com/dapr/cli/issues/1123
38 | 
39 | ### Related Issues
40 | 
41 | ## Expectations and alternatives
42 | 
43 | The scope of this proposal is to enhance the `run` CLI command, allowing users to define and run multiple services from a single run configuration file.
44 | 
45 | This proposal specifically targets running in local environments and `slim` mode, where container engines may not be available. For running a `daprd` container along with an `app` container, the solution is to use Kubernetes or docker-compose.
46 | 
47 | For this proposal we will target running the applications and sidecars as processes in the OS.
48 | 
49 | > Note: All other commands in the `dapr` CLI for self-hosted mode are written to work with processes
50 | 
51 | ## Requirements
52 | 
53 | The main requirements for the command:
54 | - being able to configure multiple dapr apps from a single configuration file
55 | - users should be able to use normal `dapr` CLI commands for self-hosted mode against any apps that are started through `dapr compose`
56 | 
57 | An additional requirement for this feature is to come up with conventions on how to organize/run Dapr projects locally.
58 | 
59 | ## Proposed Structure for organizing Dapr projects locally
60 | 
61 | Currently the `dapr` CLI initializes a folder called `.dapr` in the home directory (user profile directory for Windows), and the default configurations and resources to be used are stored there.
62 | 
63 | Users developing different apps using dapr will have different resources/configurations that they use per application.
Each time the user has to run the application with particular config and resources directory values, they have to override the flags for the `dapr run` command.
64 | 
65 | Instead, the following convention is proposed for loading the resources/config for an application.
66 | The command expects the following directory structure:
67 | ```
68 | .dapr/
69 | |____ config.yaml
70 | |
71 | |____ resources/
72 |       |
73 |       |____ statestore.yaml
74 |       |____ pubsub.yaml
75 |       |____ resiliency_conf.yaml
76 |       |____ subscription.yaml
77 | ```
78 | In each app directory, there can be a `.dapr` folder, which contains a `resources` directory and a `config.yaml` file. If that directory is not present, the default locations are used: `~/.dapr/resources/` and `~/.dapr/config.yaml` (`%USERPROFILE%` instead of `~` for Windows).
79 | 
80 | > Note: This change will be made in `dapr run` only when the newly introduced `-f` flag is used. See [below](#precedence-rules) for details on which folder content will take precedence when a run configuration is given as input.
81 | 
82 | > Note: This change does not impact the `bin` folder where the `dapr` CLI looks for the `daprd` and `dashboard` binaries. That will still remain the same `~/.dapr/bin/` (`%USERPROFILE%` for Windows).
83 | 
84 | ## Proposed Structure for run configuration file
85 | 
86 | > Expected default file name is `dapr.yaml`
87 | 
88 | ```yaml
89 | version: 1
90 | common:
91 |   resources_dir: ./app/components # any dapr resources to be shared across apps
92 |   env: # any environment variable shared among apps
93 |     - DEBUG: true
94 | apps:
95 |   - app_id: webapp
96 |     app_dir: ./webapp/
97 |     resources_dir: ./webapp/components # (optional) can be default by convention too, ignore if dir is not found.
98 |     config_file: ./webapp/config.yaml # (optional) can be default by convention too, ignore if file is not found.
99 |     app_protocol: HTTP
100 |     app_port: 8080
101 |     app_health_check_path: "/healthz" # All _ converted to - for all properties defined under daprd section
102 |     command: ["python3", "app.py"]
103 |   - app_id: backend
104 |     app_dir: ./backend/
105 |     app_protocol: GRPC
106 |     app_port: 3000
107 |     unix_domain_socket: "/tmp/test-socket"
108 |     env:
109 |       - DEBUG: false
110 |     command: ["./backend"]
111 | ```
112 | > Note: Running the dependencies for each app as containers is out of scope for this discussion initially. We might consider that in the future.
113 | 
114 | - Each file contains a `common` object, which contains `env`, `resources_dir` and `config_file` values that can be used in common across all the apps defined in this YAML
115 | - There is an `apps` section that lists the different app configs.
116 | - Each app config has the following fields:
117 |   - `app_id` application ID (mandatory field). Passed to `daprd` as `--app-id`.
118 |   - `app_dir` directory of the application (mandatory field).
119 |   - `resources_dir` (optional) directory(ies) of all dapr resources (components, resiliency policies, subscription CRDs) (overrides the common definition). Passed to `daprd` as `--resources-dir`.
120 |   - `config_file` (optional) the configuration file to be used for this app (overrides the common definition). Passed to `daprd` as `--config-file`.
121 |   - `app_protocol` application protocol, HTTP or gRPC; defaults to HTTP. Passed to `daprd` as `--app-protocol`.
122 |   - `app_port` port the app listens on, if any. Passed to `daprd` as `--app-port`.
123 |   - other `dapr run` parameters (mostly pass-through flags to `daprd`). All properties must use `_` as separators; they will be validated (so that no unknown flags are passed) and translated to `-` for the command-line arguments of `daprd`.
124 |   - `command`: ["exec", "arg1", "arg2"] format for the application command
125 |   - `env`, which overrides or adds to the common env vars defined, or the shell env vars passed in when `dapr compose` is called
126 | 
127 | The `DAPR_HTTP_PORT` and `DAPR_GRPC_PORT` will be passed in as extra environment variables to the application that is being run. Those flags for `daprd` can be overridden in the run configuration file above, but that is optional, as random ports will be assigned as needed.
128 | 
129 | ### Precedence rules
130 | 
131 | For the `env` field:
132 | > Note: In addition to the defined env fields, the app also gets the `DAPR_HTTP_PORT` and `DAPR_GRPC_PORT` fields.
133 | 
134 | - If no field is present, the environment variables of the current shell which executes the CLI command are passed to the app and `daprd`.
135 | - If the `env` field is present in the `common` section, in addition to the shell environment variables, the `env` map defined will be passed to all `apps` and `daprd` sidecars.
136 | - If the `env` field is present only in a particular `apps` section, any shell environment variables, `env` maps from the `common` section, and the `env` map for the current app will be passed to both the `app` and `daprd`.
137 | - The more specific `env` key-value pairs will override the less specific ones, i.e. `apps` section specific `env` key-value pairs will override the key-value pairs from the `common` section, which will override the passed-in shell environment variables.
138 | 
139 | 
140 | For each app in the `apps` section, the `resources_dir` and `config_file` values will be resolved in the following order (see the sketch at the end of this section):
141 | 
142 | - If `resources_dir` and/or `config_file` fields are given for any `apps[i]` configuration, use that value as is.
143 | - If not, check for a `.dapr` folder in the **`apps[i].app_dir`** folder. If found, use the `.dapr/resources` folder for configuring the resources and `.dapr/config.yaml` for the `daprd` configuration file (argument `--config-file` in `daprd`).
144 | - If not, check if `resources_dir` and/or `config_file` fields are defined in the `common` section of the compose configuration file. If so, use those values for those fields.
145 | - If not, default to `~/.dapr/resources/` for `resources_dir` and `~/.dapr/config.yaml` for `config_file` values.
146 | 
147 | 
148 | ## Proposed command format
149 | 
150 | Given the run configuration file defined above, there should be a way to use it to run the different applications and `daprd` sidecars with the configuration given in the file.
151 | 
152 | For this there will be a flag `-f, --file` defined in the `dapr run` command. If the input path is a `file`, it expects the file to have the [structure defined above](#proposed-structure-for-run-configuration-file).
153 | If the path for the flag is a directory, then it expects the `dapr.yaml` file to be present in the directory with the same [structure defined above](#proposed-structure-for-run-configuration-file).
154 | 
155 | ### Interaction flow
156 | 
157 | The interaction flow for `dapr run -f ` is shown below.
158 | 
159 | ![interaction flow](./resources/20221130-I-enhance-dapr-run-multiple-apps/interaction-flow-1.png)
160 | 
161 | > Note: app-id needs to be unique across all applications that have been run using `dapr run`.
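
To make the resolution order concrete, here is a minimal Go sketch of the `resources_dir` precedence rules above (the same order applies to `config_file`). The type and function names are illustrative only and are not the actual CLI implementation:

```go
package cli

import (
	"os"
	"path/filepath"
)

// AppConfig and CommonConfig are illustrative stand-ins for the parsed
// run configuration file; they are not the actual CLI types.
type AppConfig struct {
	AppDir       string
	ResourcesDir string
}

type CommonConfig struct {
	ResourcesDir string
}

func dirExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && info.IsDir()
}

// resolveResourcesDir applies the precedence rules described above.
func resolveResourcesDir(app AppConfig, common CommonConfig, home string) string {
	if app.ResourcesDir != "" { // 1. explicit value in the app's own section
		return app.ResourcesDir
	}
	if dir := filepath.Join(app.AppDir, ".dapr", "resources"); dirExists(dir) { // 2. app-local .dapr folder
		return dir
	}
	if common.ResourcesDir != "" { // 3. common section of the run configuration
		return common.ResourcesDir
	}
	return filepath.Join(home, ".dapr", "resources") // 4. default location
}
```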
162 | ### Logging options
163 | 
164 | Right now, `dapr run` executes as a foreground interactive process; both the `daprd` logs and the associated application logs are written directly to the STDOUT of the `dapr run` _process shell_ and are not stored anywhere.
165 | 
166 | Considering that executing `dapr run -f ` will run multiple applications, routing the logs of all applications and `daprd` processes to STDOUT will make the STDOUT completely chaotic, and the user will be overwhelmed with log output.
167 | 
168 | For example, consider two applications `order-proc` and `checkout` that are run on executing `dapr run -f ` with logs routed to STDOUT:
169 | 
170 | ```
171 | ==APP== waiting for daprd to start
172 | INFO[0000] enabled gRPC tracing middleware app_id=order-proc instance=Mukundans-MacBook-Pro.local scope=dapr.runtime.grpc.api type=log ver=1.9.3
173 | ==APP==pinging dapr API
174 | INFO[0000] enabled gRPC tracing middleware app_id=checkout instance=Mukundans-MacBook-Pro.local scope=dapr.runtime.grpc.api type=log ver=1.9.3
175 | INFO[0000] started daprd app_id=order-proc instance=Mukundans-MacBook-Pro.local scope=dapr.runtime.grpc.api type=log ver=1.9.3
176 | ==APP== starting the application
177 | ==APP== processing request 1
178 | ==APP== processing request 2
179 | INFO[0000] started daprd app_id=checkout instance=Mukundans-MacBook-Pro.local scope=dapr.runtime.grpc.api type=log ver=1.9.3
180 | ==APP== processing request 2
181 | ==APP== processing request 3. request 3 calls dapr API
182 | INFO[0000] request 3 calls dapr API app_id=order-proc instance=Mukundans-MacBook-Pro.local scope=dapr.runtime.grpc.api type=log ver=1.9.3
183 | ==APP== processing request 3
184 | 
185 | ```
186 | 
187 | With logs interleaved as shown above, it will be chaotic to see which application is in which state, and so on.
188 | 
189 | Instead of writing to STDOUT, each application and its associated `daprd` process will write logs to `{application dir}/.dapr/logs/{app id}/app_{datetime}.log` and `{application dir}/.dapr/logs/{app id}/daprd_{datetime}.log`.
190 | 
191 | ## Feature lifecycle outline
192 | 
193 | Compatibility with `dapr run` is expected to be maintained. But in certain cases there might be introduction of new behavior, which might be _opt-in_ for running individual applications using `dapr run`, whereas it might be _on by default_ when `dapr run -f ` is used.
194 | 
195 | The expectation is for this feature to be refined and stabilized over a series of releases.
196 | 
197 | ### Recommendation for initial version
198 | 
199 | - Initial implementation will only support Linux OS.
200 | - `dapr run -f ` will be an interactive process running for the complete lifecycle of the applications; on exiting the process, all other spawned processes will also quit.
201 | - Logs will be written to a predefined location (present in the app dir). Users will need to manually tail the files.
202 | - [optional] Change the full `dapr run` command itself to honor the [proposed organization for dapr projects](#proposed-structure-for-organizing-dapr-projects-locally). If not, the proposed organizing structure will be used only when the `-f` flag is used.
203 | 
204 | ### Changes for future releases
205 | 
206 | - Extend support for Windows and macOS.
207 | - Extend `dapr run` to have a detached mode `-d, --detach` flag. This will also be honored when running multiple applications using the `-f` flag.
208 | - Add support for `dapr logs` to query saved logs while the application is being run.
209 | 
210 | 
211 | ## Completion checklist
212 | For the initial version
213 | - [ ] Implement the initial version of the `dapr run -f ` feature
214 | - [ ] Add E2E tests for this feature
215 | - [ ] Add documentation for this feature
216 | 
217 | For later
218 | - [ ] Enhance the `dapr run` command to have a detached mode
219 | - [ ] Enhance the `dapr logs` command to track and output logs in self-hosted mode
220 | 
--------------------------------------------------------------------------------
/20230327-RCBS-Crypto-building-block.md:
--------------------------------------------------------------------------------
1 | # Crypto(graphy) building block
2 | 
3 | - Author(s): Alessandro Segala (@ItalyPaleAle)
4 | - Updated: 2023-03-27
5 | 
6 | ## Overview
7 | 
8 | This is a proposal for a new building block for Dapr to allow developers to leverage cryptography in a SAFE and consistent way. The goal is to expose an API that allows developers to ask Dapr to perform operations such as encrypting and decrypting messages, and calculating and verifying digital signatures.
9 | 
10 | ### Business problem
11 | 
12 | Modern applications make extensive use of cryptography, which, when implemented correctly, can make solutions safer even in case data is compromised. Moreover, in certain cases the use of crypto is required to comply with industry regulations (think banking) or even with legal requirements (GDPR). However, leveraging cryptography is hard: developers need to pick the right algorithms and options, and need to learn the proper way to manage and protect keys. Additionally, there are operational complexities when teams want to limit who has access to cryptographic key material.
13 | 
14 | Organizations have increasingly started to leverage tools and services to perform cryptographic operations outside of applications. Examples include services such as Azure Key Vault, AWS KMS, Google Cloud KMS, etc. Customers may also use on-prem HSM products like Thales Luna. While those products/services perform the same or very similar operations, their APIs are very different.
15 | 
16 | This is an area where Dapr can help. Just like we're offering an abstraction on top of secret stores, we can offer an abstraction layer on top of key vaults.
17 | 
18 | ### Solution overview
19 | 
20 | Benefits include:
21 | 
22 | - Making it easier for developers to perform cryptographic operations in a safe way. Dapr provides safeguards against using unsafe algorithms, or using algorithms with unsafe options.
23 | - Keeping keys outside of applications. Applications never see key material, but can request the vault to perform operations with the keys.
24 | - Allowing greater separation of concerns. By using external vaults, only authorized teams can access private/shared key materials.
25 | - Simplifying key management and key rotation. Keys are managed in the vault and outside of the application, and they can be rotated without needing the developers to be involved (or even without restarting the apps).
26 | - Enabling better audit logging to monitor when operations are performed with keys in the vault.
27 | 
28 | ### APIs: High-level vs subtle
29 | 
30 | The building block features 2 kinds of operations:
31 | 
32 | - Low-level or "subtle" (a term frequently used to indicate low-level crypto operations; one example is how in browsers, low-level operations are in the [`crypto.subtle`](https://developer.mozilla.org/en-US/docs/Web/API/SubtleCrypto) object).
These offer developers full control over the schemes that are used, allowing them to specify parameters, keys, modes of operation, etc.
33 |   As the "subtle" name implies, using these operations requires a certain level of understanding of what they do and how to use them safely; they are not meant to be consumed by the "general public".
34 |   Dapr offers these low-level operations for two reasons:
35 |   1. Just like it is Dapr's value proposition to, for example, offer a consistent API abstracting various state stores, we will offer a consistent API to interface with key vaults. It allows using key vaults from multiple cloud providers, as well as using keys stored as Kubernetes secrets, all while interfacing with the same API.
36 |   2. Developers working on applications that require interacting with existing and/or external solutions need these lower-level APIs to be able to maintain compatibility with them.
37 | - High-level operations. At launch, these will cover data encryption and decryption only.
38 |   When using these higher-level methods, Dapr offers a great level of abstraction. Developers just need to provide a key (symmetric or asymmetric) and can then use Dapr to encrypt/decrypt data without having to worry about anything else. Dapr will choose the best ciphers and modes of operation, offering access to an encryption scheme that is secure and flexible.
39 | 
40 | ### Low-level APIs
41 | 
42 | The new building block will feature 7 low-level APIs:
43 | 
44 | - `encrypt`: encrypts arbitrary data using a key stored in the vault. It supports symmetric and asymmetric ciphers, depending on the type of key in use (and the types of keys supported by the vault).
45 | - `decrypt`: decrypts arbitrary data, performing the opposite of what `/encrypt` does.
46 | - `wrapkey`: wraps keys using other keys stored in the vault. This is exactly like encrypting data, but it expects inputs to be formatted as keys (for example, formatted as JSON Web Key) and it exposes additional algorithms not available when encrypting general data (like AES-KW).
47 | - `unwrapkey`: un-wraps (decrypts) keys, performing the opposite of what `/wrapkey` does.
48 | - `sign`: signs an arbitrary message using an asymmetric key stored in the vault (we could also consider offering HMAC here, using symmetric keys, although that is not widely supported by vault services).
49 | - `verify`: verifies a digital signature over an arbitrary message, using an asymmetric key stored in the vault (same: we may be able to offer HMAC too).
50 | - `getkey`: can be used only with asymmetric keys stored in the vault, and returns the public part of the key.
51 | 
52 | ### High-level APIs
53 | 
54 | The high-level APIs at launch will only include support for **encrypting** and **decrypting** messages (of arbitrary length), using the [Dapr encryption scheme](#dapr-encryption-scheme-daprioencv1).
55 | 
56 | Although these APIs will be available over both gRPC and HTTP, the gRPC implementation is **strongly** preferred, since it allows encrypting/decrypting data as a stream. The HTTP implementation requires keeping the entire message in memory, both as plaintext and ciphertext (a limitation in the HTTP protocol itself we cannot work around), which is not desirable unless users are encrypting very small files.
57 | 
58 | ### Components
59 | 
60 | Different components will be developed to perform those operations on supported backends, such as the products/services listed above. Dapr would "translate" these calls into whatever format the backends require.
Dapr never sees the private/shared keys, which remain safely stored inside the vaults.
61 | 
62 | Additionally, we will offer a "local" crypto component where keys are stored as Kubernetes secrets and cryptographic operations are performed within the Dapr sidecar. Although this is not as secure as using an external key vault, it still offers some benefits such as using standardized APIs and separation of concerns/roles with regards to key management.
63 | 
64 | Algorithms available will depend on what the backend vaults support, but in general developers should always find AES (encrypt/decrypt only) and RSA; when supported, we can also offer ChaCha20-Poly1305 (encrypt/decrypt only) and ECC with ECDSA or EdDSA (sign/verify only).
65 | 
66 | ## Related Items
67 | 
68 | Previous proposal as GitHub issue: dapr/dapr#4508
69 | 
70 | ## Data flow: runtime and components
71 | 
72 | ![data flow](./resources/20230327-RCBS-Crypto-building-block/data-flow.png)
73 | 
74 | ## Dapr encryption scheme: dapr.io/enc/v1
75 | 
76 | In the first version of the building block, we define 2 higher-level operations to encrypt and decrypt data, in addition to the low-level operations.
77 | 
78 | > **Sources:** The encryption scheme that Dapr uses is heavily inspired by the [Tink wire format](https://developers.google.com/tink/wire-format) (from the Tink library maintained by Google), as well as by Filippo Valsorda's [age](https://age-encryption.org/v1), and Minio's [DARE](https://github.com/minio/sio).
79 | 
80 | The **Dapr encryption scheme** is optimized for processing data as a stream. Data is chunked into multiple parts which are encrypted independently. This allows us to return data to callers as a stream, even when decrypting messages, being confident that we are not flushing unverified data to the client.
81 | 
82 | ### Key
83 | 
84 | Each message is encrypted with a 256-bit symmetric **File Key (FK)** that is randomly generated by Dapr for each new message. The key must be generated as 32 bytes of output from a CSPRNG (such as Go's `crypto/rand.Reader`) and must not be reused for other files.
85 | 
86 | The FK is wrapped by Dapr using a key stored in a key vault (the **Key Encryption Key (KEK)**). The result of the wrapping operation is the **Wrapped File Key (WFK)**. The algorithm used depends on the type of the KEK as well as the algorithms supported by the component, in order of preference:
87 | 
88 | - For symmetric keys:
89 |   - AES-KW with 256-bit keys ([RFC 3394](https://www.rfc-editor.org/rfc/rfc3394.html)): `A256KW`
90 |     - Because the File Key is 256-bit long, only 256-bit wrapping keys can be used
91 |   - AES-CBC with 128-bit, 192-bit, and 256-bit keys: `A128CBC-NOPAD`, `A192CBC-NOPAD`, `A256CBC-NOPAD`
92 |     - These don't use PKCS#7 padding because the File Key is 256 bits, a multiple of the AES block size.
93 | - For RSA keys:
94 |   - RSA OAEP with SHA-256: `RSA-OAEP-256`
95 |   - Dapr doesn't impose limitations on the size of the key, and any key bigger than 1024 bits should work; however, 4096-bit keys are strongly recommended.
96 | 
97 | 
98 | > In the future, we should explore how to add support for elliptic curve cryptography, for example P-256/P-384/P-521 or Curve25519, which requires performing a static ECDH key agreement.
99 | 
100 | ### Ciphertext format
101 | 
102 | The ciphertext is formatted as:
103 | 
104 | ```text
105 | header || binary payload
106 | ```
107 | 
108 | ### Header
109 | 
110 | The **header** is human-readable and contains 3 items, each terminated by a line feed (0x0A) character:
111 | 
112 | 1. Name and version of the encryption scheme used. Currently, this is always `dapr.io/enc/v1`
113 | 2. The manifest, which is a JSON object.
114 | 3. The MAC for the header, base64-encoded
115 | 
116 | > Base64 encoding follows [RFC 4648 §4](https://datatracker.ietf.org/doc/html/rfc4648#section-4) ("standard" format, with padding included but optional when decoding)
117 | 
118 | ```text
119 | dapr.io/enc/v1
120 | {"k":"mykey","kw":1,"wfk":"hGYjwDpWEXEymSTFZ95zgX8krElb3Gqyls67R8zJA3k=","cph":1,"np":"Y3J5cHRvIQ=="}
121 | pBDKLrhAWL7IAvDKBV/v7lmbTG6AEZbf3srUN0Pnn30=
122 | ```
123 | 
124 | #### Manifest
125 | 
126 | The second line in the header is the **manifest**, which is a compact JSON object.
127 | 
128 | Its corresponding Go struct is:
129 | 
130 | ```go
131 | type Manifest struct {
132 | 	// Name of the key that can be used to decrypt the message.
133 | 	// This is optional, and if specified can be in the format `key` or `key/version`.
134 | 	KeyName string `json:"k,omitempty"`
135 | 	// ID of the wrapping algorithm used.
136 | 	// 0x01 = A256KW
137 | 	// 0x02 = A128CBC-NOPAD
138 | 	// 0x03 = A192CBC-NOPAD
139 | 	// 0x04 = A256CBC-NOPAD
140 | 	// 0x05 = RSA-OAEP-256
141 | 	KeyWrappingAlgorithm int `json:"kw"`
142 | 	// The Wrapped File Key
143 | 	WFK []byte `json:"wfk"`
144 | 	// ID of the cipher used.
145 | 	// 0x01 = AES-GCM
146 | 	// 0x02 = ChaCha20-Poly1305
147 | 	Cipher int `json:"cph"`
148 | 	// Random sequence of 7 bytes generated by a CSPRNG
149 | 	NoncePrefix []byte `json:"np"`
150 | }
151 | ```
152 | 
153 | - **`KeyName`** is the name of the key that can be used to decrypt the message. Usually this is the same as the name of the key used to encrypt the message, but when asymmetric ciphers are used, it could be different. Including a `KeyName` in the manifest is not required, but when it's present, it's used as the default value for the key name while decrypting the document (however, users can override this value by passing a custom one while decrypting the document).
154 | - **`Cipher`** indicates the cipher used to encrypt the actual data, and it must be an [AEAD](https://en.wikipedia.org/wiki/Authenticated_encryption#Authenticated_encryption_with_associated_data_(AEAD)) symmetric cipher.
155 |   - Dapr will choose AES-GCM as the cipher by default.
156 |   - ChaCha20-Poly1305 is offered as an option for users that work with hardware that doesn't support AES-NI (such as the Raspberry Pi), and needs to be enabled manually.
157 |   - In the future, we can support other authenticated ciphers such as AES-CBC with HMAC-SHA256.
158 | 
159 | #### MAC
160 | 
161 | The third and final line is the MAC for the header, which is computed with HMAC-SHA-256 over the previous 2 lines (including the final newline character) with a key that is derived from the (plain-text) File Key with HKDF-SHA-256:
162 | 
163 | ```text
164 | mac-key = HKDF-SHA-256(ikm = file key, salt = empty, info = "header")
165 | MAC = HMAC-SHA-256(key = mac-key, message = first 2 lines of the header)
166 | ```
167 | 
168 | > HKDF-SHA-256 is a key derivation function based on HMAC with SHA-256. See [RFC 5869 ("HMAC-based Extract-and-Expand Key Derivation Function (HKDF)")](https://www.rfc-editor.org/rfc/rfc5869.html). Being based on HMAC, it's not vulnerable to length-extension attacks, so we do not consider it necessary to use SHA-512 and truncate the output to 256 bits.
169 | 
170 | Note that there's one newline character (0x0A) at the end of the MAC, which concludes the header.
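
As a concrete illustration, here is a minimal Go sketch of the header MAC computation defined above. It assumes the `golang.org/x/crypto/hkdf` package and is a sketch of the scheme, not the runtime's actual implementation:

```go
package enc

import (
	"crypto/hmac"
	"crypto/sha256"
	"io"

	"golang.org/x/crypto/hkdf"
)

// headerMAC computes the header MAC as defined above:
//   mac-key = HKDF-SHA-256(ikm = file key, salt = empty, info = "header")
//   MAC     = HMAC-SHA-256(key = mac-key, message = first 2 lines of the header)
func headerMAC(fileKey, firstTwoLines []byte) ([]byte, error) {
	// Derive a 256-bit MAC key from the plain-text File Key.
	macKey := make([]byte, 32)
	kdf := hkdf.New(sha256.New, fileKey, nil, []byte("header"))
	if _, err := io.ReadFull(kdf, macKey); err != nil {
		return nil, err
	}
	// firstTwoLines must include the newline terminating the manifest line.
	h := hmac.New(sha256.New, macKey)
	h.Write(firstTwoLines)
	return h.Sum(nil), nil
}
```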
171 | 
172 | > Because each JSON encoder could produce a slightly different output, when verifying the manifest the MAC should be computed on the exact manifest string as included in the header. Verifiers should not re-encode the raw message as JSON.
173 | 
174 | ### Binary payload
175 | 
176 | The binary payload begins immediately after the header (after the 3rd newline character) and it includes each encrypted segment of data:
177 | 
178 | ```text
179 | segment_0 || segment_1 || ... || segment_k
180 | ```
181 | 
182 | ### Segments
183 | 
184 | The plaintext is chunked into segments of 64KB (65,536 bytes) each; the last segment may be shorter. Segments must never be empty, unless the entire file is empty.
185 | 
186 | > Because segments are 64KB each, and we can have up to 2^32 segments, the maximum size of the encrypted message is 256TB.
187 | 
188 | Each segment of plaintext is encrypted independently and stored together with its authentication tag:
189 | 
190 | ```text
191 | encrypted chunk || tag
192 | ```
193 | 
194 | Segments are encrypted with a **Payload Key (PK)** that is derived from the (plain-text) File Key and the nonce prefix:
195 | 
196 | ```text
197 | payload-key = HKDF-SHA-256(ikm = file key, salt = nonce prefix, info = "payload")
198 | ```
199 | 
200 | Each segment is encrypted using a different 12-byte nonce:
201 | 
202 | ```text
203 | nonce_prefix || i || last_segment
204 | ```
205 | 
206 | Where:
207 | 
208 | - `nonce_prefix` (7 bytes) is the nonce prefix from the header
209 | - `i` (4 bytes) is the sequence number, as a 32-bit unsigned integer counter, encoded as big-endian. The first segment has sequence number 0, and it increases by one for each subsequent segment.
210 | - `last_segment` (1 byte) is `0x01` if this is the last segment, or `0x00` otherwise
211 | 
212 | ## Components
213 | 
214 | Components in dapr/components-contrib implement low-level primitives only, while all higher-level operations are performed by the runtime, so they are executed in a consistent way across all backends/services. This is because the job of the components is limited to actually interacting with the key vaults, and everything else is best handled by the runtime.
215 | 
216 | Components are to be placed in the `crypto` folder and must implement the `SubtleCrypto` interface:
217 | 
218 | ```go
219 | import "github.com/lestrrat-go/jwx/v2/jwk"
220 | 
221 | // SubtleCrypto offers an interface to perform low-level ("subtle") cryptographic operations with keys stored in the vault
222 | type SubtleCrypto interface {
223 | 	// GetKey returns the public part of a key stored in the vault
224 | 	// This method returns an error if the key is symmetric
225 | 	GetKey(
226 | 		// Context that can be used to cancel the running operation
227 | 		ctx context.Context,
228 | 		// Name (or name/version) of the key to use in the key vault
229 | 		key string,
230 | 	) (
231 | 		// Object containing the public key
232 | 		pubKey jwk.Key,
233 | 		// Error
234 | 		err error,
235 | 	)
236 | 
237 | 	// Encrypt encrypts a small message and returns the ciphertext
238 | 	Encrypt(
239 | 		// Context that can be used to cancel the running operation
240 | 		ctx context.Context,
241 | 		// Input plaintext
242 | 		plaintext []byte,
243 | 		// Encryption algorithm to use
244 | 		algorithm string,
245 | 		// Name (or name/version) of the key to use in the key vault
246 | 		key string,
247 | 		// Nonce / initialization vector
248 | 		// Ignored with asymmetric ciphers
249 | 		nonce []byte,
250 | 		// Associated Data when using AEAD ciphers
251 | 		// Optional, can be nil
252 | 		associatedData []byte,
253 | 	) (
254 | 		// Encrypted ciphertext
255 | 		ciphertext []byte,
256 | 		// Authentication tag
257 | 		// This is nil when not using an authenticated cipher
258 | 		tag []byte,
259 | 		// Error
260 | 		err error,
261 | 	)
262 | 
263 | 	// Decrypt decrypts a small message and returns the plaintext
264 | 	Decrypt(
265 | 		// Context that can be used to cancel the running operation
266 | 		ctx context.Context,
267 | 		// Input ciphertext
268 | 		ciphertext []byte,
269 | 		// Encryption algorithm to use
270 | 		algorithm string,
271 | 		// Name (or name/version) of the key to use in the key vault
272 | 		key string,
273 | 		// Nonce / initialization vector
274 | 		// Ignored with asymmetric ciphers
275 | 		nonce []byte,
276 | 		// Authentication tag
277 | 		// Ignored when not using an authenticated cipher
278 | 		tag []byte,
279 | 		// Associated Data when using AEAD ciphers
280 | 		// Optional, can be nil
281 | 		associatedData []byte,
282 | 	) (
283 | 		// Decrypted plaintext
284 | 		plaintext []byte,
285 | 		// Error
286 | 		err error,
287 | 	)
288 | 
289 | 	// WrapKey wraps a key
290 | 	WrapKey(
291 | 		// Context that can be used to cancel the running operation
292 | 		ctx context.Context,
293 | 		// Key to wrap as jwk.Key object
294 | 		plaintextKey jwk.Key,
295 | 		// Encryption algorithm to use
296 | 		algorithm string,
297 | 		// Name (or name/version) of the key to use in the key vault
298 | 		key string,
299 | 		// Nonce / initialization vector
300 | 		// Ignored with asymmetric ciphers
301 | 		nonce []byte,
302 | 		// Associated Data when using AEAD ciphers
303 | 		// Optional, can be nil
304 | 		associatedData []byte,
305 | 	) (
306 | 		// Wrapped key
307 | 		wrappedKey []byte,
308 | 		// Authentication tag
309 | 		// This is nil when not using an authenticated cipher
310 | 		tag []byte,
311 | 		// Error
312 | 		err error,
313 | 	)
314 | 
315 | 	// UnwrapKey unwraps a key
316 | 	UnwrapKey(
317 | 		// Context that can be used to cancel the running operation
318 | 		ctx context.Context,
319 | 		// Wrapped key
320 | 		wrappedKey []byte,
321 | 		// Encryption algorithm to use
322 | 		algorithm string,
323 | 		// Name (or name/version) of the key to use in the key vault
324 | 		key string,
325 | 		// Nonce / initialization vector
326 | 		// Ignored with asymmetric ciphers
327 | 		nonce []byte,
328 | 		// Authentication tag
329 | 		// Ignored when not using an authenticated cipher
330 | 		tag []byte,
331 | 		// Associated Data when using AEAD ciphers
332 | 		// Optional, can be nil
333 | 		associatedData []byte,
334 | 	) (
335 | 		// Plaintext key
336 | 		plaintextKey jwk.Key,
337 | 		// Error
338 | 		err error,
339 | 	)
340 | 
341 | 	// Sign signs a digest
342 | 	Sign(
343 | 		// Context that can be used to cancel the running operation
344 | 		ctx context.Context,
345 | 		// Digest to sign
346 | 		digest []byte,
347 | 		// Signing algorithm to use
348 | 		algorithm string,
349 | 		// Name (or name/version) of the key to use in the key vault
350 | 		// The key must be asymmetric
351 | 		key string,
352 | 	) (
353 | 		// Signature that was computed
354 | 		signature []byte,
355 | 		// Error
356 | 		err error,
357 | 	)
358 | 
359 | 	// Verify verifies a signature
360 | 	Verify(
361 | 		// Context that can be used to cancel the running operation
362 | 		ctx context.Context,
363 | 		// Digest of the message
364 | 		digest []byte,
365 | 		// Signature to verify
366 | 		signature []byte,
367 | 		// Signing algorithm to use
368 | 		algorithm string,
369 | 		// Name (or name/version) of the key to use in the key vault
370 | 		// The key must be asymmetric
371 | 		key string,
372 | 	) (
373 | 		// True if the signature is valid
374 | 		valid bool,
375 | 		// Error
376 | 		err error,
377 | 	)
378 | }
379 | ```
380 | 
381 | A few notes about all the methods above:
382 | 
383 | 1. Keys are passed as `jwk.Key` objects, from the (excellent) [lestrrat-go/jwx library](https://pkg.go.dev/github.com/lestrrat-go/jwx/v2)
384 | 1. The `algorithm` should be represented as a constant as defined by [RFC 7518 ("JSON Web Algorithms (JWA)")](https://www.rfc-editor.org/rfc/rfc7518.html). For the most part, Dapr components should not try to parse the value submitted by the user (unless the component is the "local" one that performs crypto operations directly), and should pass the value directly to the key vault.
385 | 1. The `key` parameter can contain a version if keys can be versioned in the vault. The format should be `name/version`. If no version is specified, it's assumed to be the latest.
386 | 
387 | Notes on `WrapKey` and `UnwrapKey`:
388 | 
389 | 1. If the key needs to be encoded (common with asymmetric keys), it needs to be encoded before being passed to the component. For example, in the runtime, RSA keys may be represented in a [`rsa.PrivateKey` object](https://pkg.go.dev/crypto/rsa#PrivateKey), and need to be encoded in PKCS#1 format. Symmetric keys can be passed as-is, as they are normally stored in a byte slice already.
390 | 1. `WrapKey` and `UnwrapKey` can be implemented on top of `Encrypt` and `Decrypt` if the underlying key vault does not have a special operation for key wrapping/unwrapping, as sketched below.
391 | 
392 | Notes on `Encrypt` and `Verify`:
393 | 
394 | 1. When using an asymmetric key, these operations can be performed using the public key without hitting the key vault. However, for consistency and to ensure that we always use the latest version of the key, components should always perform them in the vault. An exception could be if the key has a specific version, in which case components may opt to download the public key, cache it, and perform the operation locally.
395 | 
396 | Notes on `Sign` and `Verify`:
397 | 
398 | 1. Certain algorithms (currently `Ed25519`) do not operate on a message digest, but rather on the message itself.
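
To illustrate note 2 on `WrapKey` and `UnwrapKey` above, here is a hypothetical sketch of implementing key wrapping on top of `Encrypt` for a symmetric key. The helper name is illustrative and not part of the proposed interface:

```go
package crypto

import (
	"context"

	"github.com/lestrrat-go/jwx/v2/jwk"
)

// wrapKeyViaEncrypt is a hypothetical helper showing how a component could
// implement WrapKey on top of Encrypt when the vault has no native
// key-wrapping operation. It assumes a symmetric key, whose raw material is
// a plain byte slice; asymmetric keys would need an encoding step
// (e.g. PKCS#1 for RSA) instead.
func wrapKeyViaEncrypt(ctx context.Context, c SubtleCrypto, plaintextKey jwk.Key,
	algorithm, keyName string, nonce, associatedData []byte) (wrappedKey, tag []byte, err error) {
	// Extract the raw key material from the jwk.Key object.
	var raw []byte
	if err = plaintextKey.Raw(&raw); err != nil {
		return nil, nil, err
	}
	// Wrapping is then just an encryption of the raw key material.
	return c.Encrypt(ctx, raw, algorithm, keyName, nonce, associatedData)
}
```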
399 | 400 | ## gRPC APIs 401 | 402 | In the Dapr gRPC APIs, we are extending the `runtime.v1.Dapr` service to add new methods: 403 | 404 | > Note: APIs will have "Alpha1" added while in preview 405 | 406 | ```proto 407 | // (Existing Dapr service) 408 | service Dapr { 409 | // SubtleGetKey returns the public part of an asymmetric key stored in the vault. 410 | rpc SubtleGetKey(SubtleGetKeyRequest) returns (SubtleGetKeyResponse); 411 | 412 | // SubtleEncrypt encrypts a small message using a key stored in the vault. 413 | rpc SubtleEncrypt(SubtleEncryptRequest) returns (SubtleEncryptResponse); 414 | 415 | // SubtleDecrypt decrypts a small message using a key stored in the vault. 416 | rpc SubtleDecrypt(SubtleDecryptRequest) returns (SubtleDecryptResponse); 417 | 418 | // SubtleWrapKey wraps a key using a key stored in the vault. 419 | rpc SubtleWrapKey(SubtleWrapKeyRequest) returns (SubtleWrapKeyResponse); 420 | 421 | // SubtleUnwrapKey unwraps a key using a key stored in the vault. 422 | rpc SubtleUnwrapKey(SubtleUnwrapKeyRequest) returns (SubtleUnwrapKeyResponse); 423 | 424 | // SubtleSign signs a message using a key stored in the vault. 425 | rpc SubtleSign(SubtleSignRequest) returns (SubtleSignResponse); 426 | 427 | // SubtleVerify verifies the signature of a message using a key stored in the vault. 428 | rpc SubtleVerify(SubtleVerifyRequest) returns (SubtleVerifyResponse); 429 | 430 | // Encrypt encrypts a message using the Dapr encryption scheme and a key stored in the vault. 431 | rpc Encrypt(stream EncryptRequest) returns (stream EncryptResponse); 432 | 433 | // Decrypt decrypts a message using the Dapr encryption scheme and a key stored in the vault. 434 | rpc Decrypt(stream DecryptRequest) returns (stream DecryptResponse); 435 | } 436 | 437 | // rpc SubtleGetKey(SubtleGetKeyRequest) returns (SubtleGetKeyResponse); 438 | 439 | // SubtleGetKeyRequest is the request object for SubtleGetKey. 440 | message SubtleGetKeyRequest { 441 | enum KeyFormat { 442 | // PEM (SPKI) (default) 443 | PEM = 0; 444 | // JSON (JSON Web Key) as string 445 | JSON = 1; 446 | } 447 | 448 | // Name of the component 449 | string component_name = 1 [json_name="component"]; 450 | // Name (or name/version) of the key to use in the key vault 451 | string name = 2; 452 | // Response format 453 | KeyFormat format = 3; 454 | } 455 | 456 | // SubtleGetKeyResponse is the response for SubtleGetKey. 457 | message SubtleGetKeyResponse { 458 | // Name (or name/version) of the key. 459 | // This is returned as response too in case there is a version. 460 | string name = 1; 461 | // Public key, encoded in the requested format 462 | string public_key = 2 [json_name="publicKey"]; 463 | } 464 | 465 | // rpc SubtleEncrypt(SubtleEncryptRequest) returns (SubtleEncryptResponse); 466 | 467 | // SubtleEncryptRequest is the request for SubtleEncrypt. 468 | message SubtleEncryptRequest { 469 | // Name of the component 470 | string component_name = 1 [json_name="component"]; 471 | // Message to encrypt. 472 | bytes plaintext = 2; 473 | // Algorithm to use, as in the JWA standard. 474 | string algorithm = 3; 475 | // Name (or name/version) of the key. 476 | string key = 4; 477 | // Nonce / initialization vector. 478 | // Ignored with asymmetric ciphers. 479 | bytes nonce = 5; 480 | // Associated Data when using AEAD ciphers (optional). 481 | bytes associated_data = 6 [json_name="associatedData"]; 482 | } 483 | 484 | // SubtleEncryptResponse is the response for SubtleEncrypt. 
485 | message SubtleEncryptResponse { 486 | // Encrypted ciphertext. 487 | bytes ciphertext = 1; 488 | // Authentication tag. 489 | // This is nil when not using an authenticated cipher. 490 | bytes tag = 2; 491 | } 492 | 493 | // rpc SubtleDecrypt(SubtleDecryptRequest) returns (SubtleDecryptResponse); 494 | 495 | // SubtleDecryptRequest is the request for SubtleDecrypt. 496 | message SubtleDecryptRequest { 497 | // Name of the component 498 | string component_name = 1 [json_name="component"]; 499 | // Message to decrypt. 500 | bytes ciphertext = 2; 501 | // Algorithm to use, as in the JWA standard. 502 | string algorithm = 3; 503 | // Name (or name/version) of the key. 504 | string key = 4; 505 | // Nonce / initialization vector. 506 | // Ignored with asymmetric ciphers. 507 | bytes nonce = 5; 508 | // Authentication tag. 509 | // This is nil when not using an authenticated cipher. 510 | bytes tag = 6; 511 | // Associated Data when using AEAD ciphers (optional). 512 | bytes associated_data = 7 [json_name="associatedData"]; 513 | } 514 | 515 | // SubtleDecryptResponse is the response for SubtleDecrypt. 516 | message SubtleDecryptResponse { 517 | // Decrypted plaintext. 518 | bytes plaintext = 1; 519 | } 520 | 521 | // rpc SubtleWrapKey(SubtleWrapKeyRequest) returns (SubtleWrapKeyResponse); 522 | 523 | // SubtleWrapKeyRequest is the request for SubtleWrapKey. 524 | message SubtleWrapKeyRequest { 525 | // Name of the component 526 | string component_name = 1 [json_name="component"]; 527 | // Key to wrap 528 | bytes plaintext_key = 2 [json_name="plaintextKey"]; 529 | // Algorithm to use, as in the JWA standard. 530 | string algorithm = 3; 531 | // Name (or name/version) of the key. 532 | string key = 4; 533 | // Nonce / initialization vector. 534 | // Ignored with asymmetric ciphers. 535 | bytes nonce = 5; 536 | // Associated Data when using AEAD ciphers (optional). 537 | bytes associated_data = 6 [json_name="associatedData"]; 538 | } 539 | 540 | // SubtleWrapKeyResponse is the response for SubtleWrapKey. 541 | message SubtleWrapKeyResponse { 542 | // Wrapped key. 543 | bytes wrapped_key = 1 [json_name="wrappedKey"]; 544 | // Authentication tag. 545 | // This is nil when not using an authenticated cipher. 546 | bytes tag = 2; 547 | } 548 | 549 | // rpc SubtleUnwrapKey(SubtleUnwrapKeyRequest) returns (SubtleUnwrapKeyResponse); 550 | 551 | // SubtleUnwrapKeyRequest is the request for SubtleUnwrapKey. 552 | message SubtleUnwrapKeyRequest { 553 | // Name of the component 554 | string component_name = 1 [json_name="component"]; 555 | // Wrapped key. 556 | bytes wrapped_key = 2 [json_name="wrappedKey"]; 557 | // Algorithm to use, as in the JWA standard. 558 | string algorithm = 3; 559 | // Name (or name/version) of the key. 560 | string key = 4; 561 | // Nonce / initialization vector. 562 | // Ignored with asymmetric ciphers. 563 | bytes nonce = 5; 564 | // Authentication tag. 565 | // This is nil when not using an authenticated cipher. 566 | bytes tag = 6; 567 | // Associated Data when using AEAD ciphers (optional). 568 | bytes associated_data = 7 [json_name="associatedData"]; 569 | } 570 | 571 | // SubtleUnwrapKeyResponse is the response for SubtleUnwrapKey. 572 | message SubtleUnwrapKeyResponse { 573 | // Key in plaintext 574 | bytes plaintext_key = 1 [json_name="plaintextKey"]; 575 | } 576 | 577 | // rpc SubtleSign(SubtleSignRequest) returns (SubtleSignResponse); 578 | 579 | // SubtleSignRequest is the request for SubtleSign. 
580 | message SubtleSignRequest { 581 | // Name of the component 582 | string component_name = 1 [json_name="component"]; 583 | // Digest to sign. 584 | bytes digest = 2; 585 | // Algorithm to use, as in the JWA standard. 586 | string algorithm = 3; 587 | // Name (or name/version) of the key. 588 | string key = 4; 589 | } 590 | 591 | // SubtleSignResponse is the response for SubtleSign. 592 | message SubtleSignResponse { 593 | // The signature that was computed 594 | bytes signature = 1; 595 | } 596 | 597 | // rpc SubtleVerify(SubtleVerifyRequest) returns (SubtleVerifyResponse); 598 | 599 | // SubtleVerifyRequest is the request for SubtleVerify. 600 | message SubtleVerifyRequest { 601 | // Name of the component 602 | string component_name = 1 [json_name="component"]; 603 | // Digest of the message. 604 | bytes digest = 2; 605 | // Signature to verify. 606 | bytes signature = 3; 607 | // Algorithm to use, as in the JWA standard. 608 | string algorithm = 4; 609 | // Name (or name/version) of the key. 610 | string key = 5; 611 | } 612 | 613 | // SubtleVerifyResponse is the response for SubtleVerify. 614 | message SubtleVerifyResponse { 615 | // True if the signature is valid. 616 | bool valid = 1; 617 | } 618 | 619 | // rpc Encrypt(stream EncryptRequest) returns (stream EncryptResponse); 620 | 621 | message EncryptRequest { 622 | // Request details. Must be present in the first message only. 623 | EncryptRequestOptions options = 1; 624 | // Chunk of data of arbitrary size. 625 | common.v1.StreamPayload payload = 2; 626 | } 627 | 628 | message EncryptRequestOptions { 629 | // Name of the component 630 | string component_name = 1 [json_name="component"]; 631 | // Name (or name/version) of the key. 632 | string key = 2; 633 | // Force algorithm to use to encrypt data: "aes-gcm" or "chacha20-poly1305" (optional) 634 | string algorithm = 10; 635 | // If true, the encrypted document does not contain a key reference. 636 | // In that case, calls to the Decrypt method must provide a key reference (name or name/version). 637 | // Defaults to false. 638 | bool omit_decryption_key_name = 11 [json_name="omitDecryptionKeyName"]; 639 | // Key reference to embed in the encrypted document (name or name/version). 640 | // This is helpful if the reference of the key used to decrypt the document is different from the one used to encrypt it. 641 | // If unset, uses the reference of the key used to encrypt the document (this is the default behavior). 642 | // This option is ignored if omit_decryption_key_name is true. 643 | string decryption_key = 12 [json_name="decryptionKey"]; 644 | } 645 | 646 | message EncryptResponse { 647 | // Chunk of data. 648 | common.v1.StreamPayload payload = 1; 649 | } 650 | 651 | // rpc Decrypt(stream DecryptRequest) returns (stream DecryptResponse); 652 | 653 | message DecryptRequest { 654 | // Request details. Must be present in the first message only. 655 | DecryptRequestOptions options = 1; 656 | // Chunk of data of arbitrary size. 657 | common.v1.StreamPayload payload = 2; 658 | } 659 | 660 | message DecryptRequestOptions { 661 | // Name of the component 662 | string component_name = 1 [json_name="component"]; 663 | // Name (or name/version) of the key to decrypt the message. 664 | // Overrides any key reference included in the message if present. 665 | // This is required if the message doesn't include a key reference (i.e. was created with omit_decryption_key_name set to true). 666 | string key = 12; 667 | } 668 | 669 | message DecryptResponse { 670 | // Chunk of data. 
671 |   common.v1.StreamPayload payload = 1;
672 | }
673 | ```
674 | 
675 | > For the `common.v1.StreamPayload` message, see [dapr/dapr#5170](https://github.com/dapr/dapr/pull/5170)
676 | 
677 | The `Encrypt` and `Decrypt` methods are stream-based. Dapr will read from the client until it has sufficient data, and will then send back the encrypted/decrypted data to the client. Clients must thus both send data to the RPC and listen for incoming messages. SDKs can offer consumers methods to read the data as a stream (e.g. in Go, methods that accept an `io.Reader` and return an `io.Reader`).
678 | 
679 | ## HTTP APIs
680 | 
681 | ### Low-level
682 | 
683 | The low-level HTTP APIs are developed as an exact "port" of the gRPC "subtle" APIs, and the contents of the request and response bodies match exactly the fields in the gRPC APIs (except for the component name, which is in the URL).
684 | 
685 | List of HTTP endpoints and the corresponding gRPC method:
686 | 
687 | - `POST /v1.0/subtlecrypto/[component]/getkey` -> SubtleGetKey
688 | - `POST /v1.0/subtlecrypto/[component]/encrypt` -> SubtleEncrypt
689 | - `POST /v1.0/subtlecrypto/[component]/decrypt` -> SubtleDecrypt
690 | - `POST /v1.0/subtlecrypto/[component]/wrapkey` -> SubtleWrapKey
691 | - `POST /v1.0/subtlecrypto/[component]/unwrapkey` -> SubtleUnwrapKey
692 | - `POST /v1.0/subtlecrypto/[component]/sign` -> SubtleSign
693 | - `POST /v1.0/subtlecrypto/[component]/verify` -> SubtleVerify
694 | 
695 | > Note: URL will begin with `/v1.0-alpha1` while in preview
696 | 
697 | > These APIs are implemented as "Universal" APIs in Dapr, where the business logic is implemented in gRPC only, and the APIs are then exposed as HTTP using the Universal API wrapper.
698 | 
699 | ### High-level
700 | 
701 | For the high-level APIs, we cannot use Universal APIs because we cannot perform bi-directional streaming with HTTP.
702 | 
703 | As mentioned earlier, using HTTP for the high-level APIs is **highly inefficient**, and users will be strongly advised against doing that outside of development or testing scenarios. In fact, while the Dapr encryption scheme is designed for streaming, that is not possible when using HTTP: first, the Dapr sidecar needs to receive the entire message (e.g. the plaintext while encrypting), and only after that can it begin responding to the caller; this means the Dapr sidecar needs to keep the entire message in memory.
704 | 
705 | List of high-level HTTP endpoints:
706 | 
707 | - `PUT /v1.0/crypto/[component]/encrypt`
708 |   - Query-string arguments:
709 |     - `key` (required): name (or name/version, URL-encoded) of the key
710 |     - `algorithm` (optional): `aes-gcm` (default) or `chacha20-poly1305`
711 |   - Body: the plain-text message to encrypt (in "raw format", i.e. not using multipart/form-data)
712 |   - Response: the ciphertext (in "raw format")
713 | - `PUT /v1.0/crypto/[component]/decrypt`
714 |   - Query-string arguments:
715 |     - `key` (required): name (or name/version, URL-encoded) of the key
716 |   - Body: the ciphertext to decrypt (in "raw format", i.e. not using multipart/form-data)
717 |   - Response: the plain-text message (in "raw format")
718 | 
719 | > Note: URL will begin with `/v1.0-alpha1` while in preview
720 | 
721 | > Note: the body is limited by Dapr's ["http-max-request-size" option](https://docs.dapr.io/operations/configuration/increase-request-size/).
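
For illustration, a minimal Go client for the high-level encrypt endpoint could look like the following. The component name `mycrypto`, the key name `mykey`, and the conventional sidecar HTTP port 3500 are placeholder values, and the URL uses the `/v1.0-alpha1` prefix noted above:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

func main() {
	plaintext := []byte("hello, world")

	// PUT the raw plaintext to the high-level encrypt endpoint.
	url := "http://localhost:3500/v1.0-alpha1/crypto/mycrypto/encrypt?key=mykey"
	req, err := http.NewRequest(http.MethodPut, url, bytes.NewReader(plaintext))
	if err != nil {
		panic(err)
	}

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// The response body is the ciphertext in raw format.
	ciphertext, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Printf("received %d bytes of ciphertext\n", len(ciphertext))
}
```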
722 | 
--------------------------------------------------------------------------------
/20230406-B-external-service-invocation.md:
--------------------------------------------------------------------------------
1 | # External Service Invocation
2 | 
3 | * Author(s): Samantha Coyle (@sicoyle)
4 | * State: Ready for Implementation
5 | * Updated: 04/06/2023
6 | 
7 | ## Overview
8 | 
9 | This is a design proposal for the requested [external service invocation feature](https://github.com/dapr/dapr/issues/4549).
10 | 
11 | The goal of this feature enhancement is to provide developers with a way to invoke any service of their choosing,
12 | using the existing building blocks provided by Dapr.
13 | 
14 | ## Background
15 | 
16 | ### Motivation
17 | We want Dapr users to be able to invoke external,
18 | non-Daprized services with ease and flexibility.
19 | 
20 | ### Goals
21 | Implement a change into `dapr/dapr` that facilitates a seamless Dapr UX to allow for external service invocation using existing building blocks and feature sets.
22 | 
23 | ### Current Shortfalls
24 | Currently, we have the service invocation API that allows Dapr users to use the invoke API on the Dapr instance.
25 | This provides many features as part of the service invocation building block, such as HTTP & gRPC service invocation,
26 | service-to-service security, resiliency, observability, access control, namespace scoping, load balancing, and service discovery.
27 | However, the current implementation does not allow for external service invocations - which is a real bummer for many Dapr users.
28 | 
29 | To remind everyone of the workaround many Dapr users rely on: the HTTP binding.
30 | Dapr users can create an HTTP binding with their external URL specified,
31 | but this approach has many drawbacks that yield a less-than-desirable developer experience.
32 | 
33 | For additional background information,
34 | please refer to the [external service invocation feature request](https://github.com/dapr/dapr/issues/4549).
35 | 
36 | ## Related Items
37 | 
38 | ### Related proposals
39 | 
40 | Formalizing the proposal here from [this issue](https://github.com/dapr/dapr/issues/4549).
41 | 
42 | ## Expectations and alternatives
43 | 
44 | * What is in scope for this proposal?
45 | Feature enhancement to enable external service invocation
46 | using the existing service invocation building block, allowing service communication over the HTTP protocol.
47 | 
48 | * What is deliberately *not* in scope?
49 | gRPC invocation, as well as additional authentication such as OAuth2,
50 | is not within the scope of this initial proposal and implementation.
51 | 
52 | 
53 | * What alternatives have been considered, and why do they not solve the problem?
54 | 1. Expanding the existing HTTP Binding.
55 | 2. Creation of another HTTP Binding explicitly dedicated to external service invocation, keeping in mind the current pain points.
56 | 
57 | Moving forward with the alternative approaches goes against the motivation and goal of this proposal,
58 | as Dapr users would be missing crucial service invocation features,
59 | be restricted on numerous avenues,
60 | and be forced to continue abiding by an awkwardly clunky workaround.
61 | Additional pros/cons may be found in the [linked issue's discussion](https://github.com/dapr/dapr/issues/4549#issuecomment-1414841151).
62 | 
63 | * Are there any trade-offs being made? (space for time, for example)
64 | N/A
65 | 
66 | * What advantages / disadvantages does this proposal have?
67 | This proposal allows service invocation to be enabled for non-Dapr endpoints.
68 | 
69 | Pros:
70 | - Extends the existing service invocation implementation.
71 | - Same feel as the current user invocation process.
72 | - Can leverage existing service invocation features like resiliency, security practices, and observability.
73 | - Leveraging a new CRD would keep our CRD setup less cluttered and easier to adjust and add to moving forward.
74 | - With a new CRD you can add/remove endpoints programmatically via kubectl.
75 | - Allows for user overrides such as base URL and related request information at invocation time.
76 | 
77 | Cons:
78 | - Creation of an additional CRD, thus increasing the duplication of boilerplate code for its setup.
79 | - Need to know the external base URL ahead of time to configure it, but that may not always be easy for end users.
80 | - Would need to change the Dapr Operator to notify on edits for external endpoints.
81 | 
82 | ## Implementation Details
83 | 
84 | ### Design
85 | 
86 | How will this work, technically?
87 | 
88 | Allow configuration of the pieces needed for external service invocation through the creation of a new CRD titled `HTTPEndpoint`.
89 | It is HTTP-specific in its `Kind`.
90 | This has the benefit of making it obvious upfront that it supports only `http`,
91 | and means we do not need `spec.allowed.protocols`.
92 | However, it has the drawback of needing additional CRDs in the future to support other protocols such as `gRPC`.
93 | The sample `yaml` file snippet below represents the proposed configuration.
94 | 
95 | ```yaml
96 | apiVersion: dapr.io/v1alpha1
97 | kind: HTTPEndpoint
98 | metadata:
99 |   name: "github"
100 | spec:
101 |   baseUrl: "http://api.github.com"
102 |   headers:
103 |     - name: "Accept-Language"
104 |       value: "en-US"
105 |     - name: "Content-Type"
106 |       value: "application/json"
107 |     - name: "Authorization"
108 |       secretKeyRef:
109 |         name: "my-secret"
110 |         key: "mymetadataSecret"
111 | auth:
112 |   secretStore: "my-secretstore"
113 | ```
114 | 
115 | Noteworthy caveat:
116 | If the `Authorization` header is specified,
117 | then it is assumed that the [auth-scheme](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Authorization) prefix (i.e. token, basic, etc.)
118 | is specified within the value for the `Authorization` header field.
119 | This allows headers to match the existing HTTP header schema,
120 | thus leading to a straightforward user experience.
121 | 
122 | Implementation for external service invocation will sit alongside the existing service invocation building block implementation, with API changes to support external invocation.
123 | 
124 | User-facing changes include overriding the URL when calling Dapr for service invocation.
125 | Users will use the existing service invocation API, but instead of using an app ID,
126 | they can use an external URL and optionally override fields at the time of invocation.
127 | 
128 | To summarize, there would be two ways of working with external service invocations:
129 | 1. The URL format, programmatically.
130 | This allows for convenience and includes a single HTTP call.
131 | 2. HTTPEndpoint resource creation, declaratively,
132 | where the `HTTPEndpoint.Name` would be used as the AppId in the existing service invocation URL.
133 | 
134 | #### Examples
135 | 
136 | 1. URL format overwritten:
137 | `http://localhost:${daprPort}/v1.0/invoke/http://api.github.com/method/`
138 | 
139 | 2.
143 | ### Feature lifecycle outline
144 | 
145 | * Compatibility guarantees
146 | This feature is fully compatible with the existing service invocation API.
147 | 
148 | * Deprecation / co-existence with existing functionality
149 | Support for external service invocation will sit alongside, and expand, the existing service invocation API.
150 | 
151 | * Feature flags
152 | N/A
153 | 
154 | ### Acceptance Criteria
155 | 
156 | How will success be measured?
157 | 
158 | * Performance targets
159 | N/A
160 | 
161 | * Compatibility requirements
162 | This feature will need to be fully compatible with the existing service invocation API.
163 | In the case that a user adds an `HTTPEndpoint` with the same name as an AppId in the same namespace and performs service invocation,
164 | then the `HTTPEndpoint` will be invoked.
165 | Calls for service invocation will first check if the AppId matches an `HTTPEndpoint` CRD,
166 | and in the case that it does, then external service invocation will occur.
167 | 
168 | * Metrics
169 | Existing service invocation tracing and metrics capabilities when calling external endpoints will be fully functional.
170 | 
171 | ## Completion Checklist
172 | 
173 | What changes or actions are required to make this proposal complete?
174 | 
175 | * Code changes
176 | * Secret resolution
177 | * Tests added (e2e, unit)
178 | * SDK changes (if needed)
179 | * Documentation
180 | 
181 | --------------------------------------------------------------------------------
/20230511-BCIRS-error-handling-codes.md:
--------------------------------------------------------------------------------
1 | # Dapr Error Handling/Codes
2 | 
3 | * Author(s): Roberto J. Rojas
4 | * State: Draft
5 | * Updated: 5/11/2023
6 | 
7 | ## Overview
8 | 
9 | Across Dapr, errors are surfaced for different conditions without consistent messages, error details, or standard formats, and with no clear indication of what or where the error originated.
10 | 
11 | This makes troubleshooting and debugging quite difficult and requires a deep understanding of the parts of Dapr and how those parts interact with each other.
12 | 
13 | To help with the issues raised above, it would be ideal if a solution could provide:
14 | - Greater details of errors that occurred.
15 | - Error details in a structured format.
16 | - Consistency in the error details.
17 | - An indication where within the Dapr execution (Init, Runtime, Components, SDKs, etc...) the error occurred.
18 | 
19 | ## Background
20 | 
21 | ## Related Items
22 | 
23 | ### Related proposals
24 | 
25 | 
26 | ### Related issues
27 | 
28 | https://github.com/dapr/dapr/issues/6068
29 | 
30 | ## Expectations and alternatives
31 | 
32 | ## Implementation Details
33 | 
34 | # Design
35 | 
36 | ## Solution
37 | Utilize and follow the [gRPC Richer Error Model](https://grpc.io/docs/guides/error/#richer-error-model) and the [Google API Errors Model in the Design Guide](https://cloud.google.com/apis/design/errors#error_model).
38 | 
39 | ### Error Code Standard
40 | The [Google API Error Model](https://cloud.google.com/apis/design/errors#error_model) has the following Protobuf format:
41 | ```proto
42 | package google.rpc;
43 | 
44 | // The `Status` type defines a logical error model that is suitable for
45 | // different programming environments, including REST APIs and RPC APIs.
46 | message Status {
47 |   // A simple error code that can be easily handled by the client. The
48 |   // actual error code is defined by `google.rpc.Code`.
49 |   int32 code = 1;
50 | 
51 |   // A developer-facing human-readable error message in English. It should
52 |   // both explain the error and offer an actionable resolution to it.
53 |   string message = 2;
54 | 
55 |   // Additional error information that the client code can use to handle
56 |   // the error, such as retry info or a help link.
57 |   repeated google.protobuf.Any details = 3;
58 | }
59 | ```
60 | 
61 | Here is one of the possible details that can be added to the above
62 | error structure. This is defined in the [error_details.proto Protobuf](https://github.com/googleapis/googleapis/blob/master/google/rpc/error_details.proto)
63 | 
64 | ```proto
65 | message ErrorInfo {
66 |   // The reason of the error. This is a constant value that identifies the
67 |   // proximate cause of the error. Error reasons are unique within a particular
68 |   // domain of errors. This should be at most 63 characters and match a
69 |   // regular expression of `[A-Z][A-Z0-9_]+[A-Z0-9]`, which represents
70 |   // UPPER_SNAKE_CASE.
71 |   string reason = 1;
72 | 
73 |   // The logical grouping to which the "reason" belongs. The error domain
74 |   // is typically the registered service name of the tool or product that
75 |   // generates the error. Example: "pubsub.googleapis.com". If the error is
76 |   // generated by some common infrastructure, the error domain must be a
77 |   // globally unique value that identifies the infrastructure. For Google API
78 |   // infrastructure, the error domain is "googleapis.com".
79 |   string domain = 2;
80 | 
81 |   // Additional structured details about this error.
82 |   //
83 |   // Keys should match /[a-zA-Z0-9-_]/ and be limited to 64 characters in
84 |   // length. When identifying the current value of an exceeded limit, the units
85 |   // should be contained in the key, not the value. For example, rather than
86 |   // {"instanceLimit": "100/request"}, should be returned as,
87 |   // {"instanceLimitPerRequest": "100"}, if the client exceeds the number of
88 |   // instances that can be created in a single (batch) request.
89 |   map<string, string> metadata = 3;
90 | }
91 | ```
92 | 
93 | ### Error Status
94 | The properties of the **google.rpc.Status** will be populated as follows:
95 | 
96 | - **Code** - Protocol level error code. These could be either gRPC or HTTP error codes. See the [gRPC Codes ProtoBuf](https://github.com/googleapis/googleapis/blob/master/google/rpc/code.proto)
97 | 
98 |     Example: "InvalidArgument Code = 3", "Internal Code = 13"
99 | 
100 | - **Message** - Error message.
101 | - **Details** - A set of standard error payloads for error details. This list can be found in [Error Details](https://github.com/googleapis/googleapis/blob/master/google/rpc/error_details.proto)
102 | 
103 |     Example: "ErrorInfo", "ResourceInfo"
104 | 
105 | 
106 | 
107 | Below is a partial table of the standard error codes provided by gRPC and how they map to HTTP error codes. The entire list can be found in the following links:
108 | - [Google API Error Handling](https://cloud.google.com/apis/design/errors#handling_errors)
109 | - [gRPC Codes ProtoBuf](https://github.com/googleapis/googleapis/blob/master/google/rpc/code.proto)
110 | 
111 | 
112 | |HTTP | gRPC        | Description |
113 | |---- | ----------- | ----------- |
114 | |200  | OK | No error. |
115 | |400  | INVALID_ARGUMENT | Client specified an invalid argument. Check error message and error details for more information. |
116 | |400  | FAILED_PRECONDITION | Request cannot be executed in the current system state, such as deleting a non-empty directory. |
117 | |400  | OUT_OF_RANGE | Client specified an invalid range. |
118 | |401  | UNAUTHENTICATED | Request not authenticated due to missing, invalid, or expired authorization credentials. |
119 | |403  | PERMISSION_DENIED | Client does not have sufficient permission. |
120 | |404  | NOT_FOUND | A specified resource is not found. |
121 | |409  | ABORTED | Concurrency conflict, such as read-modify-write conflict. |
122 | 
123 | 
124 | ### ErrorInfo (Required)
125 | The properties of the **type.googleapis.com/google.rpc.ErrorInfo** will be populated as follows:
126 | 
127 | - **Reason** - A combination of a prefix from the tables below plus the error condition code.
128 | 
129 |     Example: "DAPR_STATE_" + "ETAG_MISMATCH"
130 | 
131 | - **Domain** - With the value `dapr.io`.
132 | 
133 | - **Metadata** - A key/value map/dictionary of data relevant to the error condition.
134 | 
135 | **Note:** The metadata property **retriable**, with a boolean-like value ("true", "false", "True", "False", "TRUE", "FALSE", "1", "0"), is required.
136 | 
137 | ### ResourceInfo (Optional)
138 | The properties of the **type.googleapis.com/google.rpc.ResourceInfo** will be populated as follows:
139 | 
140 | - **ResourceType** - The building block type with version.
141 | 
142 |     Example: "state.redis/v1"
143 | 
144 | - **ResourceName** - The component name.
145 | 
146 |     Example: "my-component-name"
147 | 
148 | - **Owner** - The owner of the component.
149 | 
150 | - **Description** - Resource description or error details.
151 | 
152 | 
153 | ### Error Details Prefixes
154 | The following tables show the proposed error code prefixes used in the **reason** of the **google.rpc.ErrorInfo** for various Dapr building blocks:
155 | 
156 | 
157 | **INIT**
158 | | Dapr Module | Prefix |
159 | | ----------- | ----------- |
160 | | CLI | DAPR_CLI_INIT_* |
161 | | Self-hosted | DAPR_SELF_HOSTED_INIT_* |
162 | | K8S | DAPR_K8S_INIT_* |
163 | | Invoke | DAPR_INVOKE_INIT_* |
164 | 
165 | **RUNTIME**
166 | | Dapr Module | Prefix |
167 | | ----------- | ----------- |
168 | | CLI | DAPR_RUNTIME_CLI_* |
169 | | Self-hosted | DAPR_SELF_HOSTED_* |
170 | | dapr-2-dapr(gRPC) | DAPR_RUNTIME_GRPC_* |
171 | 
172 | **COMPONENTS**
173 | | Dapr Module | Prefix |
174 | | ----------- | ----------- |
175 | | PubSub | DAPR_PUBSUB_* |
176 | | StateStore | DAPR_STATE_* |
177 | | Bindings | DAPR_BINDING_* |
178 | | SecretStore | DAPR_SECRET_* |
179 | | ConfigurationStore | DAPR_CONFIGURATION_* |
180 | | Lock | DAPR_LOCK_* |
181 | | NameResolution | DAPR_NAME_RESOLUTION_* |
182 | | Middleware | DAPR_MIDDLEWARE_*|
183 | 
184 | 
185 | The following snippet shows an error status returned due to an `ETAG_MISMATCH` error condition. The **reason** is populated with `PREFIX+ERROR_CONDITION`:
186 | 
187 | ```json
188 | {
189 |   "code": 3,
190 |   "message": "possible etag mismatch. error from state store",
191 |   "details": [
192 |     {
193 |       "@type": "type.googleapis.com/google.rpc.ErrorInfo",
194 |       "reason": "DAPR_STATE_ETAG_MISMATCH",
195 |       "domain": "dapr.io",
196 |       "metadata": {
197 |         "key": "myapp||name"
198 |       }
199 |     },
200 |     {
201 |       "@type": "type.googleapis.com/google.rpc.ResourceInfo",
202 |       "resource_type": "state.redis/v1",
203 |       "resource_name": "my-component",
204 |       "owner": "",
205 |       "description": "possible etag mismatch. error from state store"
206 |     }
207 |   ]
208 | }
209 | ```
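For completeness, here is a minimal sketch of how a client could react to such an error programmatically; the handling shown is illustrative, not prescribed by this proposal:

```go
import (
	"google.golang.org/genproto/googleapis/rpc/errdetails"
	"google.golang.org/grpc/status"
)

// isRetriableEtagMismatch inspects the rich error details attached to a gRPC
// error and checks both the reason and the required "retriable" metadata.
func isRetriableEtagMismatch(err error) bool {
	st := status.Convert(err) // works for any gRPC error
	for _, d := range st.Details() {
		if info, ok := d.(*errdetails.ErrorInfo); ok {
			if info.GetDomain() == "dapr.io" && info.GetReason() == "DAPR_STATE_ETAG_MISMATCH" {
				// simplified check; the proposal allows several boolean-like spellings
				return info.GetMetadata()["retriable"] == "true"
			}
		}
	}
	return false
}
```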
210 | 
211 | 
212 | ### Sample Code Snippet (Go)
213 | ```go
214 | import (
215 |     ...
216 |     "google.golang.org/genproto/googleapis/rpc/errdetails"
217 |     "google.golang.org/grpc/codes"
218 |     "google.golang.org/grpc/status"
219 |     ...
220 | )
221 | ...
222 | if req.ETag != nil {
223 |     ...
224 |     ste := status.Newf(codes.InvalidArgument, messages.ErrStateGet, in.Key, in.StoreName, err.Error())
225 |     ei := errdetails.ErrorInfo{
226 |         Domain: "dapr.io",
227 |         Reason: "DAPR_STATE_ETAG_MISMATCH",
228 |         Metadata: map[string]string{
229 |             "storeName": in.StoreName,
230 |         },
231 |     }
232 |     ri := errdetails.ResourceInfo{
233 |         ResourceType: "state.redis/v1",
234 |         ResourceName: "my-redis-component",
235 |         Owner:        "user",
236 |         Description:  "possible etag mismatch. error from state store",
237 |     }
238 |     ste, err2 := ste.WithDetails(&ei, &ri)
239 |     ...
240 |     return ste.Err()
241 | }
242 | ```
243 | 
244 | ### Pros
245 | - Since the Dapr Runtime is using protocol buffers as the data format, support for the richer error model is already included in most of the gRPC implementations.
246 | - This would help minimize the changes within the Dapr ecosystem.
247 | - This solution could be used to programmatically react to errors as it provides a standard structure for the errors with details.
248 | 
249 | ### Cons
250 | - Dependencies on the gRPC richer error model.
251 | - Need to test gRPC implementation support for all Dapr SDKs.
252 | 
253 | 
254 | ## gRPC Richer Error Model POC
255 | For the POC, I've made changes to some parts of the Dapr modules. The POC code can be found in my GitHub repos under the branch **error-codes-poc**.
256 | 
257 | These are the gRPC imports used:
258 | 
259 | ```go
260 | import (
261 |     ...
262 |     "google.golang.org/genproto/googleapis/rpc/errdetails"
263 |     "google.golang.org/grpc/codes"
264 |     "google.golang.org/grpc/status"
265 |     ...
266 | )
267 | ```
268 | 
269 | The files changed for this POC:
270 | 
271 | https://github.com/robertojrojas/components-contrib/tree/error-codes-poc
272 | 
273 | - state/redis/redis.go
274 | - state/store.go
275 | 
276 | https://github.com/robertojrojas/dapr-kit/tree/error-codes-poc
277 | 
278 | - pkg/proto/customerrors/v1/customerrors.pb.go
279 | - proto/customerrors/v1/customerrors.proto
280 | - status/customerrorcodes.go
281 | - status/status.go
282 | 
283 | https://github.com/robertojrojas/dapr/tree/error-codes-poc
284 | 
285 | - pkg/diagnostics/grpc_tracing.go
286 | - pkg/grpc/api.go
287 | - pkg/http/api.go
288 | - pkg/http/responses.go
289 | 
290 | https://github.com/robertojrojas/dapr-go-sdk/tree/error-codes-poc
291 | 
292 | - client/state.go
293 | 
294 | https://github.com/robertojrojas/dapr-cli/tree/error-codes-poc
295 | 
296 | - pkg/standalone/invoke.go
297 | 
298 | https://github.com/robertojrojas/dapr-dotnet-sdk/tree/error-codes-poc
299 | 
300 | - src/Dapr.Client/DaprClientGrpc.cs
301 | 
302 | ### Feature lifecycle outline
303 | 
304 | ### Acceptance Criteria
305 | 
306 | 
307 | ## Completion Checklist
308 | 
309 | --------------------------------------------------------------------------------
/20230627-P-proposal-sdk-approval.md:
--------------------------------------------------------------------------------
1 | # Cross SDK changes approval
2 | 
3 | * Author(s): Artur Souza (@artursouza)
4 | * State: Ready
5 | * Updated: 06/27/2023
6 | 
7 | ## Overview
8 | 
9 | This defines the criteria for how SDK proposals are approved.
10 | 
11 | ## Background
12 | 
13 | The SDK Spec SIG will define the spec for how SDKs operate today, but there are still multiple areas where SDK designs are inconsistent due to decisions made by each SDK repository's maintainers. There needs to be a process to improve consistency across SDKs, independent of the SDK Spec SIG, since there are ongoing discussions for new features that keep SDKs drifting away from each other.
14 | 
15 | ## Proposal
16 | 
17 | Proposals that impact multiple SDKs should be presented in this repository and will be approved if at least a simple majority (50% + 1) of the maintainers of SDK repositories approve the change. The vote counted is per person and not per repository. Voting ends when the simple majority is reached; cannot be mathematically reached; or is idle (no changes, comments or votes) for more than 30 calendar days. The list of maintainers is based on the maintainers listed on GitHub as of the time of the proposal submission. Voting should be done via a comment in the proposal's pull request in this repository.
18 | 
19 | ## Approval of this proposal
20 | 
21 | As a way to kickstart this process, the STC should approve this proposal following the existing STC voting criteria, commenting on this proposal's pull request for future reference.
--------------------------------------------------------------------------------
/20230714-S-sdk-resiliency.md:
--------------------------------------------------------------------------------
1 | # Resiliency in Dapr SDKs
2 | 
3 | * Author(s): Artur Souza (@artursouza)
4 | * State: Ready for Implementation
5 | * Updated: 07/14/2023
6 | 
7 | ## Overview
8 | 
9 | This is a design proposal to [support resiliency when SDKs invoke remote Dapr APIs](https://github.com/dapr/dapr/issues/6609).
10 | 
11 | This will allow applications to talk to a remote or shared sidecar, without having to rely on custom retry and timeout logic in the user's application.
12 | 
13 | ## Background
14 | 
15 | ### Motivation
16 | - Applications to communicate with remote Dapr APIs when there is communication degradation.
17 | - Applications to communicate with the sidecar when there is degradation of the sidecar's health.
18 | 
19 | ### Goals
20 | - Dapr users can talk to a remote Dapr API without having to implement resiliency logic in their app.
21 | - System administrators don't need to have different configurations per application based on programming language, meaning the same configuration will work with every SDK.
22 | 
23 | ### Current Shortfalls
24 | - Applications need to implement resiliency (retry and timeout) on top of the existing SDK.
25 | 
26 | ## Related Items
27 | 
28 | ### Related proposals
29 | 
30 | Formalizing the proposal here from [this issue](https://github.com/dapr/dapr/issues/6609).
31 | 
32 | ## Expectations and alternatives
33 | 
34 | * What is in scope for this proposal?
35 |   - SDKs to support a consistent (and small) set of environment variables to configure resiliency on SDKs
36 |   - Consistent set of retriable errors for gRPC and HTTP APIs.
37 | 
38 | * What is deliberately *not* in scope?
39 |   - Circuit Breaking
40 |   - A highly configurable spec for resiliency policies (like the CRD in runtime)
41 | 
42 | 
43 | * What alternatives have been considered, and why do they not solve the problem?
44 |   1. Leave every SDK as-is:
45 |     - Undetermined behavior when the sidecar is down or too slow. For example, the Java SDK simply gets stuck forever if there is no response from the sidecar (tested with ToxiProxy).
46 |     - Timeout and retry need to be implemented in the user's application.
47 |   2. Add retry only:
48 |     - Undetermined behavior when the sidecar is down or too slow. For example, the Java SDK simply gets stuck forever if there is no response from the sidecar (tested with ToxiProxy).
49 |   3. Let each SDK decide how to handle this:
50 |     - Inconsistent behavior and configuration for resiliency, requiring system admins to know specifics of each SDK.
51 | 
52 | * Are there any trade-offs being made? (space for time, for example)
53 |   1. Simplification of the retry policy, having an opinionated setting for most configurations.
54 |   2. No support for Circuit Breaking or API health check prior to calling the Dapr API.
55 | 
56 | * What advantages / disadvantages does this proposal have?
57 | Pros:
58 |   - Bring consistency and a simple set of configuration points that work across SDKs
59 |   - Document expected behavior for SDKs regarding timeout and retries
60 | 
61 | Cons:
62 |   - See trade-offs mentioned above.
63 | 
64 | ## Implementation Details
65 | 
66 | ### Design
67 | 
68 | * `DAPR_API_MAX_RETRIES` defines the maximum number of retries; SDKs can determine which strategy will be implemented (linear, exponential backoff, etc). `0` is the default value and means no retry. `-1` or any negative value means infinite retries.
69 | * `DAPR_API_TIMEOUT_SECONDS` defines the maximum waiting time to connect and receive a response for an HTTP or gRPC call. Defaults to `0`. `0` (or negative) is handled as "undefined" and calls might hang forever on the client side. This setting is the timeout for each API invocation and not the timeout of the aggregated time for retries. This setting can be used without retries.
70 | * All environment variables can be overwritten via parameters to the Dapr client or on a per-request basis, in the following order (higher priority on top):
71 |   1. Per-request parameter
72 |   2. Parameter when instantiating a Dapr client object
73 |   3. Properties or any other language specific configuration framework.
74 |   4. Environment variables
75 | * SDK to retry if the error is on connection.
76 | * SDK to retry in case of the following retriable codes:
77 |   * gRPC: DEADLINE_EXCEEDED, UNAVAILABLE.
78 |   * HTTP: 408, 429 (respect `Retry-After` header), 500, 502, 503, 504
79 | * The same client should still be usable if the API goes down but is restored after any arbitrary amount of time. In other words, the unavailability of the Dapr API should not require the application to restart.
80 | 
81 | #### Example of implementation
82 | 
83 | https://github.com/dapr/java-sdk/pull/889
84 | 
85 | ### Feature lifecycle outline
86 | 
87 | * Compatibility guarantees
88 | Retries and timeouts should be disabled by default.
89 | 
90 | * Deprecation / co-existence with existing functionality
91 | If customers prefer a more fine-tuned resiliency logic, they can still achieve this by disabling the SDK resiliency and using a 3rd-party library to handle retries with custom logic.
92 | 
93 | * Feature flags
94 | Retries and timeouts are disabled by default with the value `0`.
95 | 
96 | ### Acceptance Criteria
97 | 
98 | How will success be measured?
99 | 
100 | * Performance targets
101 | N/A
102 | 
103 | * Compatibility requirements
104 | Same environment variables work with any SDK.
105 | SDKs to pass a new compatibility test (in runtime).
106 | 
107 | * Metrics
108 | N/A
109 | 
110 | ## Completion Checklist
111 | 
112 | What changes or actions are required to make this proposal complete?
113 | 
114 | * SDK changes
115 |   * Add support for new environment variables
116 |   * Add new parameters when instantiating a new Dapr client
117 |   * Add per-request optional parameters
118 |   * Add integration testing on each SDK when possible (can use ToxiProxy)
119 | * Compatibility tests
120 |   * Implement a compatibility test in runtime (similar to what was done for actor invocation)
121 | * Documentation
122 | 
--------------------------------------------------------------------------------
/20230918-S-unified-api-token-env-variable.md:
--------------------------------------------------------------------------------
1 | # Unify the `DAPR_API_TOKEN` env variable across all SDKs
2 | 
3 | * Author(s): Elena Kolevska (@elena-kolevska)
4 | * State: Ready for Implementation
5 | * Updated: 2023-09-18
6 | 
7 | ## Overview
8 | 
9 | This is a design proposal to unify usage of the `DAPR_API_TOKEN` variable across all SDKs.
10 | 
11 | ## Background
12 | 
13 | Currently, the `DAPR_API_TOKEN` env variable is used in some SDKs, but not in others. For example, the Python SDK uses it, but the JavaScript SDK does not. This can be confusing for users, and makes it difficult to write documentation that is consistent across all SDKs. It is also an unnecessary complication for maintainers who switch between SDKs.
14 | 
15 | And finally, with a unified environment variable across all SDKs, system administrators don't need to have different configurations per application based on programming language, because the same environment variables will work with every SDK.
16 | 
17 | ## Related Items
18 | 
19 | This has already been discussed in different repos, notably in [this issue](https://github.com/dapr/java-sdk/issues/303) in the java-sdk.
20 | 
21 | We are moving to unify other environment variables across all SDKs, as discussed in [this proposal](https://github.com/dapr/proposals/blob/main/0008-S-sidecar-endpoint-tls.md).
22 | 
23 | 
24 | ## Expectations and alternatives
25 | 
26 | **What is in scope for this proposal?**
27 | - All supported SDKs need to consistently support the `DAPR_API_TOKEN` environment variable for authentication to Dapr
28 | 
29 | **SDKs that already use the `DAPR_API_TOKEN` env variable:**
30 | - java-sdk
31 | - go-sdk
32 | - dotnet-sdk
33 | - python-sdk
34 | - php-sdk
35 | 
36 | **SDKs that currently don't use the `DAPR_API_TOKEN` env variable:**
37 | - js-sdk
38 | - rust-sdk
39 | - cpp-sdk
40 | 
41 | 
42 | ## Implementation Details
43 | 
44 | ### Design
45 | 
46 | - The `DAPR_API_TOKEN` environment variable will be used in all SDKs to authenticate to Dapr. The token will be passed to the Dapr sidecar via the `dapr-api-token` header.
47 | 
48 | 
49 | ### Feature lifecycle outline
50 | 
51 | - Some SDKs currently accept a Dapr API token as an argument through the constructor. In this case, when a token is specified through the constructor of the client, it will take precedence over the environment variable.
52 | - If there is an existing environment variable for the Dapr API token by a different name, the new `DAPR_API_TOKEN` variable will take precedence.
53 | 
54 | ### Acceptance Criteria
55 | 
56 | How will success be measured?
57 | 
58 | * Performance targets: N/A
59 | * Compatibility requirements: Same environment variables work with any SDK
60 | * Metrics: N/A
61 | 
62 | ## Completion Checklist
63 | 
64 | What changes or actions are required to make this proposal complete?
65 | 
66 | * SDK changes (if needed)
67 |   - Add support for the `DAPR_API_TOKEN` environment variable to all supported SDKs
68 |   - Add integration testing on each SDK when possible
69 | * Documentation
70 |   - Update documentation to reflect the new environment variable
71 | 
--------------------------------------------------------------------------------
/20231024-CIR-trust-distribution.md:
--------------------------------------------------------------------------------
1 | # Dapr Trust Distribution
2 | 
3 | * Author(s): joshvanl
4 | * State: Approved
5 | * Updated: 2023-10-24
6 | 
7 | ## Overview
8 | 
9 | This is a design proposal to implement a proper trust distribution process in Dapr.
10 | Trust distribution will be implemented in a seamless way without downtime.
11 | This will improve and unlock security related features.
12 | 
13 | ## Background
14 | 
15 | ### Motivation
16 | 
17 | To support the creation of features:
18 | 
19 | - Proper Certificate Authority (CA) rotation (without re-using the root's private key)
20 | - External CA sources such as cert-manager and cloud provider CAs etc.
21 | - Dapr multi-cluster and networking federation
22 | 
23 | Related issues:
24 | - [Multicluster Kubernetes](https://github.com/dapr/dapr/issues/3460)
25 | - [[Proposal] Support third-party CA - Integrate Cert Manager with Dapr](https://github.com/dapr/dapr/issues/3968)
26 | - [[Proposal] Automatic root certificate rotation](https://github.com/dapr/dapr/issues/5958)
27 | 
28 | ### Goals
29 | 
30 | - Implement an active trust distribution mechanism for Dapr in Kubernetes that responds to updates.
31 | - Trust distribution in self-hosted mode can be implemented by the user.
32 | - Enable root certificate rotation with no downtime.
33 | 
34 | ### Non-Goals
35 | 
36 | - Sentry implements CA root rotation.
37 | - Implement external CA support, though this proposal will enable this feature to be developed in the future.
38 | - Implement Dapr trust federation.
39 | 
40 | ### Current Shortfalls
41 | 
42 | Trust distribution is the act of propagating trust data to enable secure communication or networking between peers.
43 | In the case of Dapr, this involves propagating PEM encoded CA bundle files to clients and servers in the cluster, which are then used to authenticate peers over TLS.
44 | Today, there are two methods of CA deployment in Dapr: either Sentry-generated, or provided by the Dapr cluster administrator.
45 | From the perspective of trust distribution, these two modes are functionally the same as they both result in the `dapr-trust-bundle` ConfigMap containing the CA bundle, and the `dapr-trust-bundle` Secret containing the issuer certificate chain.
46 | From here, trust distribution for the control plane occurs by the Operator, Placement, and Injector services reading from the mounted `dapr-trust-bundle` ConfigMap.
47 | Trust distribution for Daprds occurs via the Injector patching the Daprd container with an environment variable containing the CA Bundle originating from the `dapr-trust-bundle` ConfigMap.
48 | 
49 | The problem with the current strategy is that once trust has been distributed (the `dapr-trust-bundle` ConfigMap and Secret are populated), the root of trust cannot change in the cluster.
50 | This is because trust bundles are only injected into Dapr containers at Pod creation time.
51 | Trust anchors are also set as environment variables whose values are static for the entire duration of a Unix process, meaning they cannot be dynamically updated, for example in the event of CA root rotation.
52 | Today, Dapr pods have to be restarted in order to pick up a new trust bundle.
53 | It is also undefined and untested whether the control plane components will successfully pick up a new trust bundle during execution, though whether they can is irrelevant, as Daprds do not support this feature either.
54 | 
55 | ## Solution
56 | 
57 | ### Trust Distributor
58 | 
59 | It is paramount that the entity that conducts the trust distribution is separate from the entity that issues identities from that root of trust; in Dapr's case this is Sentry.
60 | This is because trust distribution must happen out of band of identity issuance.
61 | An analogy: the Internet's roots of trust are delivered via the computer's operating system or browser, rather than fetched from the DNS servers themselves.
62 | Similarly, asset SHA hashes should be downloaded from a separate source than the asset servers themselves.
63 | Decoupling these roles also improves the separation of concerns between the identity issuer and the trust distributor.
64 | 
65 | Trust distribution will be conducted by the Operator and written to the ConfigMap `dapr-root-ca.crt` in all Namespaces.
66 | The Operator is a natural fit as it is not Sentry (the identity issuance server), and machinery for Kubernetes controllers already exists in the Operator today.
67 | ConfigMaps are a natural choice as they can be mounted by Pods & containers in Kubernetes, and trust bundles are not secrets, so Secrets are not appropriate.
68 | There is also prior art of other projects distributing trust in this way, such as [Istio](https://github.com/istio/istio/blob/4c65649a9b116584281fadcaf8c3dd6b42d34036/istioctl/pkg/workload/workload_test.go#L340) and cert-manager's [trust-manager](https://github.com/cert-manager/trust-manager#example-bundle).
69 | We can also add support for writing to Kubernetes [ClusterTrustBundles](https://github.com/kubernetes/enhancements/issues/3257), though this resource is very new, and will not be available in all target Kubernetes cluster versions.
70 | The `dapr-root-ca.crt` ConfigMap name is consistent with Kubernetes and Istio naming.
71 | The operator will source the root of trust to be distributed from the mounted `dapr-root-ca.crt` Secret in the control plane namespace.
72 | The operator will watch this mounted file for updates using [fsnotify](https://github.com/fsnotify/fsnotify), and distribute the contents to the named ConfigMap in all namespaces.
73 | 
74 | Once propagated, the Injector, Placement, and Dapr sidecars can mount this ConfigMap and use it as the root of trust when connecting to peers.
75 | Similarly, when the file is updated, these services can update their local trust stores to use the new version of the bundle (a minimal sketch of this watch-and-reload loop is shown at the end of this section).
76 | 
77 | The Operator will need to metadata-watch all Namespaces and ConfigMaps in the cluster.
78 | The Operator should not fully inform these resources, as that would massively increase its memory consumption.
79 | In the event of a Namespace being created, the Operator will write the ConfigMap to that namespace.
80 | The Operator will also ensure that the `dapr-root-ca.crt` ConfigMap stays consistent with its local trust bundle version.
81 | 
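To make the watch-and-reload behaviour concrete, here is a minimal Go sketch of the kind of loop described above; the mount path and callback are hypothetical, and Dapr's actual implementation details may differ:

```go
package main

import (
	"crypto/x509"
	"log"
	"os"

	"github.com/fsnotify/fsnotify"
)

// watchTrustAnchors reloads the CA bundle whenever the mounted file changes,
// handing the fresh pool to onUpdate so new TLS connections use it.
func watchTrustAnchors(path string, onUpdate func(*x509.CertPool)) error {
	load := func() {
		pem, err := os.ReadFile(path)
		if err != nil {
			log.Printf("read trust anchors: %v", err)
			return
		}
		pool := x509.NewCertPool()
		if pool.AppendCertsFromPEM(pem) {
			onUpdate(pool) // swap the trust store used for new connections
		}
	}
	w, err := fsnotify.NewWatcher()
	if err != nil {
		return err
	}
	if err := w.Add(path); err != nil {
		return err
	}
	load() // initial load
	go func() {
		for range w.Events {
			// Mounted ConfigMaps update via symlink swaps; production code
			// would watch the parent directory and re-add the watch.
			load()
		}
	}()
	return nil
}
```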
82 | ### CA Rotation
83 | 
84 | CA rotation can now be solved by the new trust bundle being _appended_ to the existing `dapr-root-ca.crt` Secret in the control plane namespace.
85 | This new bundle containing the old and new CA will be propagated to all services by the Operator, allowing for a zero-downtime, graceful rollover of the CA.
86 | The Dapr CLI will be updated to automate this task, ensuring that the new appended trust bundle has been correctly propagated to all namespaces before writing the new CA to Sentry.
87 | Checking propagation involves ensuring the new bundle contents are present in the named ConfigMap in all namespaces.
88 | The CLI needs to take care of the fact that mounted ConfigMaps can take up to [60 seconds](https://github.com/kubernetes/kubernetes/blob/v1.26.0/pkg/kubelet/pod_workers.go#L1175C1-L1175C96) before the file is updated on the container mount, so there is some lag between the ConfigMap being updated and the trust bundle being updated in a service's trust store.
89 | 
90 | ### External CA Support
91 | 
92 | External CA support is now made easier by the fact that the external CA trust bundle can be safely written to the `dapr-root-ca.crt` Secret in the control plane namespace.
93 | Dapr services will then trust the external CA's root of trust.
94 | 
95 | ### Dapr multi-cluster and Networking Federation
96 | 
97 | Similarly, the trust bundle of another Dapr cluster can be appended to the existing CA bundle so that the two clusters may trust one another.
98 | 
99 | ### Self Hosted Mode
100 | 
101 | Self-hosted mode will continue to function as before; however, using a file reference for the trust bundle rather than an environment variable means that services can respond and update their trust stores on file changes.
102 | 
103 | ### Deprecation
104 | 
105 | The `DAPR_TRUST_ANCHORS` environment variable in Daprd will become deprecated in favour of a file reference configured via the CLI flag `-trust-anchors-file`.
106 | For backwards compatibility, the `DAPR_TRUST_ANCHORS` environment variable will continue to be supported until `v1.14`, when the Injector service will no longer patch it into Daprd sidecar containers.
107 | 
108 | ## Completion Checklist
109 | 
110 | - [ ] In Kubernetes CA mode, Sentry writes its own generated CA bundle to the `dapr-root-ca.crt` Secret in the control plane namespace.
111 | - [ ] The Operator propagates this trust bundle to the `dapr-root-ca.crt` ConfigMap in all namespaces.
112 | - [ ] Placement, Injector, and Daprds all read and watch the trust anchors from the mounted `dapr-root-ca.crt` ConfigMap referenced by the `-trust-anchors-file` flag's value, updating their trust stores accordingly.
113 | - [ ] Dapr CLI CA rotation command updated to respect the `dapr-root-ca.crt` Secret and append the CA bundle accordingly.
114 | 
115 | ### Acceptance Criteria
116 | - Trust distribution is active and responds to updates.
117 | - Dapr CLI `mtls renew-certificate` has been updated to implement proper CA rotation.
--------------------------------------------------------------------------------
/20240508-S-sidecar-endpoint-tls.md:
--------------------------------------------------------------------------------
1 | # Dapr endpoint env and TLS support in SDKs
2 | 
3 | * Author(s): Artur Souza (@artursouza), Josh van Leeuwen (@JoshVanL)
4 | * State: Ready for Implementation
5 | * Updated: 05/08/2024
6 | 
7 | ## Overview
8 | 
9 | This is a design proposal to [support remote or shared Dapr APIs](https://github.com/dapr/dapr/issues/6035).
10 | 
11 | This will allow applications to talk to a remote or shared sidecar, without having to rely on a localhost sidecar running per app instance.
It means the communication will likely require TLS communication. 12 | 13 | ## Background 14 | 15 | ### Motivation 16 | - Applications to communicate to Dapr APIs without a local sidecar. 17 | 18 | ### Goals 19 | - Dapr users can talk to a remote Dapr API without using CLI or any other tool, by just running the application with environment variables. 20 | - System administrators don't need to have different configurations per application based on programming language, meaning the same environment variables will work with every SDK - exception is when SDK only supports HTTP or GRPC, but sysadmin can simply always setup environment variables for both protocols to guarantee consistency. 21 | 22 | ### Current Shortfalls 23 | - Inconsistency on setting up Dapr's sidecar endpoint on each SDK. 24 | - Not every SDK support a secure endpoint. 25 | 26 | ## Related Items 27 | 28 | ### Related proposals 29 | 30 | Formalizing the proposal here from [this issue](https://github.com/dapr/dapr/issues/6035). 31 | 32 | ## Expectations and alternatives 33 | 34 | * What is in scope for this proposal? 35 | - SDKs to support a consistent pair of environment variables to setup Dapr API 36 | - SDKs to support TLS endpoints for Dapr API 37 | 38 | * What is deliberately *not* in scope? 39 | - SSL certificate pinning 40 | - Have consistency of other environment variables for SDK (`DAPR_HOST`, `DAPR_SIDECAR_IP`, etc) 41 | - Have consistency of how Dapr client is instanciated on each SDK 42 | 43 | 44 | * What alternatives have been considered, and why do they not solve the problem? 45 | 1. Leave every SDK as-is: 46 | - Not every SDK offers an environment variable to configure Dapr endpoint, forcing configuration in code 47 | - Environment variables per SDK, forcing sysadmin to know about each application's language use 48 | - Not every SDK supports TLS endpoint 49 | 2. Add TLS support only, giving each SDK room to decide on how to expose it to the user 50 | - Not every SDK offers an environment variable to configure Dapr endpoint, forcing configuration in code 51 | - Environment variables per SDK, forcing sysadmin to know about each application's language use 52 | 53 | * Are there any trade-offs being made? (space for time, for example) 54 | 1. Leaving existing environment variables for host and port as-is per SDK, but driving consistency on this new way. 55 | 2. Not changing Dapr's DAPR_HOST (or equivalent), DAPR_HTTP_PORT and DAPR_GRPC_PORT. 56 | 57 | * What advantages / disadvantages does this proposal have? 58 | Pros: 59 | - Bring consistency in Dapr API endpoint configuration cross SDKs 60 | - Add support for TLS endpoint 61 | 62 | Cons: 63 | - Does not address existing inconsistencies in client instantiation and env variables 64 | - Needs to define a priority between new env variables and old ones 65 | 66 | ## Implementation Details 67 | 68 | ### Design 69 | 70 | * `DAPR_GRPC_ENDPOINT` defines entire endpoint for gRPC, not just host: `dapr-grpc.mycompany.com`. No port in the URL defaults to 443. 71 | * `DAPR_HTTP_ENDPOINT` defines entire endpoint for HTTP, not just host: `https://dapr-http.mycompany.com` 72 | * Port is parsed from the hostport string (`dapr.mycompany.com:8080`) or via the default port of the protocol used in the URL (80 for `plaintext` and 443 for `TLS`) 73 | * `DAPR_GRPC_ENDPOINT` and `DAPR_HTTP_ENDPOINT` can be set at the same time since some SDKs (Java, as of now) supports both protocols at the same time and app can pick which one to use. 
74 | * `DAPR_HTTP_ENDPOINT` must be parsed, and the protocol will be used by the SDK to determine if communication is over TLS (if not done automatically). In summary, `https` means a secure channel.
75 | * `DAPR_GRPC_ENDPOINT` must be parsed, and the query parameter will be used to determine whether the endpoint uses TLS. In summary, `?tls=true` means to use TLS. An empty query parameter defaults TLS to false. SDKs should error on unrecognised or invalid query parameters.
76 | * `DAPR_GRPC_ENDPOINT` and `DAPR_HTTP_ENDPOINT` have priority over the existing `DAPR_HOST` and `DAPR_HTTP_PORT` or `DAPR_GRPC_PORT` environment variables. Hardcoded values passed by the application via the constructor take priority over any environment variable. In summary, this is the priority list (highest on top):
77 |   1. Values passed via constructor or builder method.
78 |   2. Properties or any other language specific configuration framework.
79 |   3. `DAPR_GRPC_ENDPOINT` and `DAPR_HTTP_ENDPOINT`
80 |   4. Existing `DAPR_HOST` (or equivalent, defaulting to `127.0.0.1`) + `DAPR_HTTP_PORT` or `DAPR_GRPC_PORT`
81 | 
82 | `DAPR_GRPC_ENDPOINT` host port parsing example:
83 | 
84 | ```
85 | myhost => port=443 tls=false resolver=dns
86 | myhost?tls=false => port=443 tls=false resolver=dns
87 | myhost:443 => port=443 tls=false resolver=dns
88 | myhost:1003 => port=1003 tls=false resolver=dns
89 | myhost:1003?tls=true => port=1003 tls=true resolver=dns
90 | dns://myhost:1003?tls=true => port=1003 tls=true resolver=dns
91 | unix://my.sock => port= tls=false resolver=unix
92 | unix://my.sock?tls=true => port= tls=true resolver=unix
93 | http://myhost => port=80 tls=false resolver=dns
94 | https://myhost => port=443 tls=true resolver=dns
95 | ```
96 | 
97 | #### Example of implementation
98 | 
99 | https://github.com/dapr/java-sdk/blob/76aec01e9aa4af7a72b910d77685ddd3f0bf86f3/sdk/src/main/java/io/dapr/client/DaprClientBuilder.java#L172C3-L192
100 | 
101 | ### Feature lifecycle outline
102 | 
103 | * Compatibility guarantees
104 | This feature should also allow a localhost definition, e.g. `http://127.0.0.1:3500`.
105 | 
106 | * This feature should continue to allow using resolvers other than DNS (e.g.
107 | `unix://`).
108 | 
109 | * Deprecation / co-existence with existing functionality
110 | This feature takes priority over existing (inconsistent) environment variables from each SDK. If the app provides a hardcoded value for the Dapr endpoint (via constructor, for example), it takes priority.
111 | Use of the existing `DAPR_API_TOKEN` environment variable is highly encouraged for remote APIs but not required.
112 | 
113 | * SDKs will continue to accept the old behaviour of `DAPR_GRPC_ENDPOINT` with
114 | the scheme value `https` to signal to use TLS. Where a value contains both the
115 | `https` scheme and the `?tls=false` query, SDKs will error and refuse to connect.
116 | 
117 | * Feature flags
118 | N/A
119 | 
120 | ### Acceptance Criteria
121 | 
122 | How will success be measured?
123 | 
124 | * Performance targets
125 | N/A
126 | 
127 | * Compatibility requirements
128 | Same environment variables work with any SDK - except if the protocol is not supported by the given SDK.
129 | 
130 | * Metrics
131 | N/A
132 | 
133 | ## Completion Checklist
134 | 
135 | What changes or actions are required to make this proposal complete?
136 | 137 | * SDK changes 138 | * Add support for new environment variable 139 | * Add integration testing on each SDK when possible 140 | * Documentation 141 | 142 | ## Test matrix 143 | 144 | | URL | Endpoint string to pass to grpc client | Hostname | Port | TLS | Error | 145 | | ------------------------------------------------------------ | ----------------------------------------- | --------------------------------- | ---- | ------ | --------------------------------------------------------------------------- | 146 | | :5000 | dns:localhost:5000 | localhost | 5000 | FALSE | | 147 | | :5000?tls=false | dns:localhost:5000 | localhost | 5000 | FALSE | | 148 | | :5000?tls=true | dns:localhost:5000 | localhost | 5000 | TRUE | | 149 | | myhost | dns:myhost:443 | myhost | 443 | FALSE | | 150 | | myhost?tls=false | dns:myhost:443 | myhost | 443 | FALSE | | 151 | | myhost?tls=true | dns:myhost:443 | myhost | 443 | TRUE | | 152 | | myhost:443 | dns:myhost:443 | myhost | 443 | FALSE | | 153 | | myhost:443?tls=false | dns:myhost:443 | myhost | 443 | FALSE | | 154 | | myhost:443?tls=true | dns:myhost:443 | myhost | 443 | TRUE | | 155 | | [http://myhost](http://myhost) | dns:myhost:80 | myhost | 80 | FALSE | | 156 | | [http://myhost?tls=false](http://myhost?tls=false) | | | | | the tls query parameter is not supported for http(s) endpoints: 'tls=false' | 157 | | [http://myhost?tls=true](http://myhost?tls=true) | | | | | the tls query parameter is not supported for http(s) endpoints: 'tls=true' | 158 | | [http://myhost:443](http://myhost:443) | dns:myhost:443 | myhost | 443 | FALSE | | 159 | | [http://myhost:443?tls=false](http://myhost:443?tls=false) | | | | | the tls query parameter is not supported for http(s) endpoints: 'tls=false' | 160 | | [http://myhost:443?tls=true](http://myhost:443?tls=true) | | | | | the tls query parameter is not supported for http(s) endpoints: 'tls=true' | 161 | | [http://myhost:5000](http://myhost:5000) | dns:myhost:5000 | myhost | 5000 | FALSE | | 162 | | [http://myhost:5000?tls=false](http://myhost:5000?tls=false) | | | | | the tls query parameter is not supported for http(s) endpoints: 'tls=false' | 163 | | [http://myhost:5000?tls=true](http://myhost:5000?tls=true) | | | | | the tls query parameter is not supported for http(s) endpoints: 'tls=true' | 164 | | [https://myhost:443](https://myhost:443) | dns:myhost:443 | myhost | 443 | TRUE | | 165 | | [https://myhost:443?tls=false](https://myhost:443?tls=false) | | | | | the tls query parameter is not supported for http(s) endpoints: 'tls=false' | 166 | | [https://myhost:443?tls=true](https://myhost:443?tls=true) | | | | | the tls query parameter is not supported for http(s) endpoints: 'tls=true' | 167 | | dns:myhost | dns:myhost:443 | myhost | 443 | FALSE | | 168 | | dns:myhost?tls=false | dns:myhost:443 | myhost | 443 | FALSE | | 169 | | dns:myhost?tls=true | dns:myhost:443 | myhost | 443 | TRUE | | 170 | | dns://myauthority:53/myhost | dns://myauthority:53/myhost:443 | myhost | 443 | FALSE | | 171 | | dns://myauthority:53/myhost?tls=false | dns://myauthority:53/myhost:443 | myhost | 443 | FALSE | | 172 | | dns://myauthority:53/myhost?tls=true | dns://myauthority:53/myhost:443 | myhost | 443 | TRUE | | 173 | | dns://myhost | | | | | invalid dns authority 'myhost' in URL 'dns://myhost' | 174 | | unix:my.sock | unix:my.sock | my.sock | | FALSE | | 175 | | unix:my.sock?tls=true | unix:my.sock | my.sock | | TRUE | | 176 | | unix://my.sock | unix://my.sock | my.sock | | FALSE | | 177 | | unix://my.sock?tls=true | 
unix://my.sock | my.sock | | TRUE | | 178 | | unix-abstract:my.sock | unix-abstract:my.sock | my.sock | | FALSE | | 179 | | unix-abstract:my.sock?tls=true | unix-abstract:my.sock | my.sock | | TRUE | | 180 | | vsock:mycid:5000 | vsock:mycid:5000 | mycid | 5000 | FALSE | | 181 | | vsock:mycid:5000?tls=true | vsock:mycid:5000 | mycid | 5000 | TRUE | | 182 | | [2001:db8:1f70::999:de8:7648:6e8] | dns:[2001:db8:1f70::999:de8:7648:6e8]:443 | [2001:db8:1f70::999:de8:7648:6e8] | 443 | FALSE | | 183 | | dns:[2001:db8:1f70::999:de8:7648:6e8]:5000 | dns:[2001:db8:1f70::999:de8:7648:6e8]:5000 | [2001:db8:1f70::999:de8:7648:6e8] | 5000 | FALSE | | 184 | | dns:[2001:db8:1f70::999:de8:7648:6e8]:5000?abc=[] | | | | | Error: query parameters are not supported for gRPC endpoints: 'abc=[]' | 185 | | dns://myauthority:53/[2001:db8:1f70::999:de8:7648:6e8] | dns://myauthority:53/[2001:db8:1f70::999:de8:7648:6e8]:443 | [2001:db8:1f70::999:de8:7648:6e8] | 443 | FALSE | | 186 | | dns:[2001:db8:1f70::999:de8:7648:6e8] | dns:[2001:db8:1f70::999:de8:7648:6e8]:443 | [2001:db8:1f70::999:de8:7648:6e8] | 443 | FALSE | | 187 | | https://[2001:db8:1f70::999:de8:7648:6e8] | dns:[2001:db8:1f70::999:de8:7648:6e8]:80 | [2001:db8:1f70::999:de8:7648:6e8] | 80 | TRUE | | 188 | | https://[2001:db8:1f70::999:de8:7648:6e8]:5000 | dns:[2001:db8:1f70::999:de8:7648:6e8]:5000 | [2001:db8:1f70::999:de8:7648:6e8] | 5000 | TRUE | | 189 | | host:5000/v1/dapr | | | | | paths are not supported for gRPC endpoints: '/v1/dapr' | 190 | | host:5000/?a=1 | | | | | paths are not supported for gRPC endpoints: '/' | 191 | | inv-scheme://myhost | | | | | invalid scheme 'inv-scheme' in URL 'inv-scheme://myhost' | 192 | | inv-scheme:myhost:5000 | | | | | invalid scheme 'inv-scheme' in URL 'inv-scheme:myhost:5000' | 193 | -------------------------------------------------------------------------------- /20240517-R-http-metrics-path-matching.md: -------------------------------------------------------------------------------- 1 | # HTTP Metrics Path Matching 2 | 3 | * Author(s): @nelson-parente @jmprusi 4 | * State: Ready for Implementation 5 | * Updated: 2024-05-17 6 | 7 | ## Overview 8 | 9 | This is a design proposal to implement a new opt-in API for path matching within Dapr HTTP metrics. By enabling path matching users can define paths that will be matched and replaced without being at risk of unbounded path cardinality and other issues that motivate the introduction of the low cardinality mode in Dapr. This will enable users to have more meaningful and manageable metrics in a controlled way. 10 | 11 | ## Background 12 | 13 | In [#6723](https://github.com/dapr/dapr/issues/6723), Dapr reduced the cardinality of its HTTP metrics in order to address memory issues users reported and restrain unbounded path cardinality which posed as a security threat. This change introduced two cardinality modes (high/low) controlled by the `increasedCardinality` flag. 14 | 15 | The caveat with low cardinality is that it dropped paths since they were one of the sources for the high cardinality. While this is a reasonable approach, it leads in the loss of important data needed for monitoring, performance analysis, and troubleshooting. To address this, we opened [#7719](https://github.com/dapr/dapr/issues/7719). 16 | 17 | This proposal introduces an opt-in API that allows users to define the paths that matter the most, effectively adding matched paths to metrics without relying on regex's, which are known to be CPU-intensive. 
18 | 19 | With this API, users will be able to configure path matching through a simple interface, providing the paths they care about and tailoring metrics to their specific requirements without compromising memory and security issues. 20 | 21 | ## Related Items 22 | 23 | ### Related issues 24 | 25 | Initial low cardinality issue: [#6723](https://github.com/dapr/dapr/issues/6723) 26 | Issue related with low cardinality dropped metrics data: [#7719](https://github.com/dapr/dapr/issues/7719) 27 | 28 | ## Expectations and alternatives 29 | 30 | The proposed solution adds value to users' observability without compromising security and memory usage. The API is designed to be simple to configure, allowing users to configure the paths they care about. We considered other regex-based solutions but these are known to be CPU-intensive and can lead to performance degradation. 31 | 32 | ## Implementation Details 33 | 34 | ### Solution 35 | 36 | This proposal introduces an opt-in API for path matching within Dapr HTTP metrics. The goal is to offer a way to match and include paths in the metrics without relying on CPU-intensive regex's and with a guarantee that path cardinality is controlled. 37 | 38 | ```yaml 39 | spec: 40 | metric: 41 | enabled: true 42 | http: 43 | increasedCardinality: true 44 | pathMatching: 45 | - /orders/{orderID}/items/{itemID} 46 | - /users/{userID} 47 | - /categories/{categoryID}/subcategories/{subCategoryID} 48 | - /customers/{customerID}/orders/{orderID} 49 | ``` 50 | 51 | ##### Examples 52 | 53 | Examples of how the Path Matching API can be used in the metrics. The examples compare the metric `dapr_http_server_request_count` with the possible configuration combinations: low and high cardinality, with and without path matching. 54 | 55 | - Low Cardinality Without Path Matching 56 | 57 | 58 | ```yaml 59 | http: 60 | increasedCardinality: false 61 | ``` 62 | 63 | ``` 64 | dapr_http_server_request_count{app_id="ping",method="InvokeService/ping",status="200"} 5 65 | ``` 66 | - Low Cardinality With Path Matching 67 | 68 | ```yaml 69 | http: 70 | increasedCardinality: false 71 | pathMatching: 72 | - /orders/{orderID} 73 | ``` 74 | 75 | ``` 76 | dapr_http_server_request_count{app_id="ping",method="GET",path="/orders/{orderID}",status="200"} 4 77 | dapr_http_server_request_count{app_id="ping",method="GET",path="",status="200"} 1 78 | ``` 79 | 80 | - High Cardinality Without Path Matching 81 | 82 | ```yaml 83 | http: 84 | increasedCardinality: true 85 | ``` 86 | 87 | ``` 88 | dapr_http_server_request_count{app_id="ping",method="GET",path="/items/123456",status="200"} 1 89 | dapr_http_server_request_count{app_id="ping",method="GET",path="/orders/1234",status="200"} 1 90 | dapr_http_server_request_count{app_id="ping",method="GET",path="/orders/12345",status="200"} 1 91 | dapr_http_server_request_count{app_id="ping",method="GET",path="/orders/123456",status="200"} 1 92 | dapr_http_server_request_count{app_id="ping",method="GET",path="/orders/1234567",status="200"} 1 93 | ``` 94 | 95 | - High Cardinality With Path Matching 96 | 97 | ```yaml 98 | http: 99 | increasedCardinality: true 100 | pathMatching: 101 | - /orders/{orderID} 102 | ``` 103 | 104 | ``` 105 | dapr_http_server_request_count{app_id="ping",method="GET",path="/items/123456",status="200"} 1 106 | dapr_http_server_request_count{app_id="ping",method="GET",path="/orders/{orderID}",status="200"} 4 107 | ``` 108 | #### Features 109 | 110 | - `pathMatching` where users can specify paths for path matching. 
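To illustrate the matching semantics, here is a minimal sketch built on Go's standard-library pattern matching (the same pattern syntax referenced below); the `newPathNormalizer` helper is a hypothetical illustration, not Dapr's implementation:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newPathNormalizer maps a concrete request path to the configured pattern
// (a low-cardinality label), or "" when nothing matches.
func newPathNormalizer(patterns []string) func(r *http.Request) string {
	mux := http.NewServeMux()
	for _, p := range patterns {
		mux.Handle(p, http.NotFoundHandler()) // handler body is irrelevant; we only use the matcher
	}
	return func(r *http.Request) string {
		_, pattern := mux.Handler(r) // returns the registered pattern, "" if unmatched
		return pattern
	}
}

func main() {
	normalize := newPathNormalizer([]string{"/orders/{orderID}", "/users/{userID}"})
	req := httptest.NewRequest(http.MethodGet, "/orders/1234", nil)
	fmt.Println(normalize(req)) // prints "/orders/{orderID}" (requires Go 1.22+ wildcard patterns)
}
```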
111 | 
112 | The path matching will use the same patterns as the Go standard library (see https://pkg.go.dev/net/http#hdr-Patterns), ensuring reliable and well-supported path matching.
113 | 
114 | When the increasedCardinality flag is set to false (default in 1.14), non-matched paths are transformed into a catch-all bucket to control and limit cardinality, preventing unbounded growth. On the other hand, when increasedCardinality is true, non-matched paths are passed through as they normally would be, allowing for potentially higher cardinality but preserving the original path data. This is the main difference in the feature behavior when used with low versus high cardinality.
115 | 
116 | This Path Matching API empowers users that rely on metrics and observability scraped under the low-cardinality mode that will soon be the default, providing a controlled means to manage path cardinality. Those indifferent may opt for low cardinality, while the legacy high-cardinality mode remains available for alternative needs.
117 | 
118 | ### Acceptance Criteria
119 | 
120 | - The Path Matching API is successfully integrated into the Dapr runtime, allowing users to enable/disable path matching via configuration.
121 | 
122 | ## Completion Checklist
123 | 
124 | - [ ] Implementation in daprd
125 | - [ ] API documentation
126 | - [ ] Integration, E2E tests
127 | 
--------------------------------------------------------------------------------
/20240618-RCBS-Conversation-building-block.md:
--------------------------------------------------------------------------------
1 | # Conversation building block
2 | 
3 | * Author(s): Loong Dai (@daixiang0)
4 | * Updated: 2024-06-18
5 | 
6 | ## Overview
7 | 
8 | This is a proposal for a new building block for Dapr to allow developers to leverage LLM services in a consistent way. The goal is to expose an API that lets developers ask Dapr to perform these requests on their behalf.
9 | 
10 | ## Background
11 | 
12 | Today there are many large language model servers and toolkits, each providing its own API, such as [OpenAI](https://openai.com/), [Hugging Face](https://huggingface.co/), [Kserve](https://kserve.github.io/website/latest/), [OpenVINO](https://docs.openvino.ai/) and so on.
13 | 
14 | For developers, it is hard to migrate from one to the other due to hardcoded integrations and API differences.
15 | 
16 | Startups and communities need to implement the popular APIs as soon as possible, or users may give up trying or adopting them because of the cost of migration.
17 | 
18 | This is an area where Dapr can help. We can offer an abstraction layer on top of those APIs.
19 | 
20 | ## Component YAML
21 | 
22 | A component can have its own set of attributes, as elsewhere in Dapr. For example:
23 | 
24 | ```yaml
25 | apiVersion: dapr.io/v1alpha1
26 | kind: Component
27 | metadata:
28 |   name: chatgpt4o
29 | spec:
30 |   type: conversation.chatgpt
31 |   version: v1
32 |   metadata:
33 |   - name: key
34 |     value: "bfnskdlgdhklhk53adfgsfnsgmtyqdghbid34891"
35 |   - name: model
36 |     value: "gpt-4o"
37 |   - name: endpoints
38 |     value: "us.api.openai.com,eu.api.openai.com"
39 |   - name: loadBalancingPolicy
40 |     value: "ROUNDROBIN"
41 | ```
42 | 
43 | ## gRPC APIs
44 | 
45 | In the Dapr gRPC APIs, we are extending the `runtime.v1.Dapr` service to add new methods:
46 | 
47 | > Note: APIs will have "Alpha1" added while in preview
48 | 
49 | > Note: The API token is stored in the component
50 | 
51 | ```proto
52 | // (Existing Dapr service)
53 | service Dapr {
54 |   // Converse.
55 |   rpc Converse(stream ConversationRequest) returns (stream ConversationResponse);
56 | }
57 | 
58 | // ConversationRequest is the request object for Conversation.
59 | message ConversationRequest {
60 |   // The name of the Conversation component
61 |   string name = 1;
62 |   // Conversation context - the Id of an existing chat room (like in ChatGPT)
63 |   optional string conversationContext = 2;
64 |   // Inputs for the conversation; multiple inputs are supported in one request.
65 |   repeated string inputs = 3;
66 |   // Parameters for all custom fields.
67 |   repeated google.protobuf.Any parameters = 4;
68 | }
69 | 
70 | // ConversationResult is the result for one input.
71 | message ConversationResult {
72 |   // Result for the one conversation input.
73 |   string result = 1;
74 |   // Parameters for all custom fields.
75 |   repeated google.protobuf.Any parameters = 2;
76 | }
77 | 
78 | // ConversationResponse is the response for Conversation.
79 | message ConversationResponse {
80 |   // Conversation context - the Id of an existing or newly created chat room (like in ChatGPT)
81 |   optional string conversationContext = 1;
82 | 
83 |   // An array of results.
84 |   repeated ConversationResult outputs = 2;
85 | }
86 | ```
87 | 
88 | ## HTTP APIs
89 | 
90 | The HTTP APIs mirror the gRPC APIs:
91 | 
92 | `POST /v1.0/conversation/[component]/converse` -> Converse
93 | 
94 | ```json
95 | REQUEST = {
96 |   "conversationContext": "fb512b84-7a1a-4fb4-8bd2-ac7d2ec45984",
97 |   "inputs": ["what is Dapr", "Why use Dapr"],
98 |   "parameters": {}
99 | }
100 | 
101 | RESPONSE = {
102 |   "conversationContext": "fb512b84-7a1a-4fb4-8bd2-ac7d2ec45984",
103 |   "outputs": [
104 |     {
105 |       "result": "Dapr is a distributed application runtime ...",
106 |       "parameters": {}
107 |     },
108 |     {
109 |       "result": "Dapr can help developers ...",
110 |       "parameters": {}
111 |     }
112 |   ]
113 | }
114 | 
115 | ```
116 | 
117 | > Note: URL will begin with `/v1.0-alpha1` while in preview
--------------------------------------------------------------------------------
/20240917-BR-resiliency-error-code-retries.md:
--------------------------------------------------------------------------------
1 | # Resiliency Policy Error Code Retries
2 | 
3 | * Author(s): Anton Troshin (@antontroshin), Taction (@taction)
4 | * Updated: 2024-09-18
5 | 
6 | ## Overview
7 | 
8 | This is a design proposal to provide additional functionality for Dapr Resiliency Policy Retries, so the policy can be enforced only on specific response error codes.
9 | It only focuses on the `retries` (https://docs.dapr.io/operations/resiliency/policies/#retries) part of the policy.
10 | 
11 | ## Background
12 | 
13 | In some applications, certain status codes are used to indicate business errors, and retrying the operation might not be necessary or otherwise desirable.
14 | Customizing retry behavior allows handling error codes in a more granular way that suits each use case.
15 | Currently, all errors are retried when the policy is applied.
16 | Some status codes are not retryable, and subsequent calls will result in the same error. Avoiding these retry calls will reduce the overall number of requests, traffic, and errors.
17 | 
18 | ## Related Items
19 | 
20 | https://github.com/dapr/dapr/issues/6683
21 | https://github.com/dapr/dapr/issues/6428
22 | https://github.com/dapr/dapr/issues/7697
23 | 
24 | PR:
25 | https://github.com/dapr/dapr/pull/7132
26 | 
27 | Docs:
28 | https://github.com/dapr/docs/issues/4254
29 | https://github.com/dapr/docs/issues/3859
30 | 
31 | ## Expectations and alternatives
32 | 
33 | * What is in scope for this proposal?
34 |   - HTTP and gRPC Service Invocation, direct and proxied
35 |   - Bindings
36 |   - Pub/Sub
37 | 
38 | ## Implementation Details
39 | 
40 | ### Design
41 | 
42 | Add a new object field to the `retries` policy spec that allows the user to specify the status codes that should be retried, with separate fields for HTTP and gRPC.
43 | The new fields should be optional and will default to the existing behavior, which is to retry on all errors.
44 | 
45 | ### Example 1:
46 | In this example, the retry policy will retry **_only_** on HTTP 500, the HTTP status code range 502-504 (inclusive), and the gRPC status code range 2-4 (inclusive).
47 | The rest of the status codes will not be retried.
48 | 
49 | ```yaml
50 | apiVersion: dapr.io/v1alpha1
51 | kind: Resiliency
52 | metadata:
53 |   name: myresiliency
54 | scopes:
55 |   - app1
56 | spec:
57 |   policies:
58 |     retries:
59 |       pubsubRetry:
60 |         policy: constant
61 |         duration: 5s
62 |         maxRetries: 10
63 |         matching:
64 |           httpStatusCodes: "500,502-504"
65 |           gRPCStatusCodes: "2-4"
66 | ```
67 | 
68 | ### Example 2:
69 | In this example, the retry policy will retry **_only_** on the gRPC status code range 1-15 (inclusive).
70 | However, this policy will not apply to HTTP status codes, which will be retried according to the default behavior, which is to retry on all errors.
71 | 
72 | ```yaml
73 | apiVersion: dapr.io/v1alpha1
74 | kind: Resiliency
75 | metadata:
76 |   name: myresiliency
77 | scopes:
78 |   - app1
79 | spec:
80 |   policies:
81 |     retries:
82 |       pubsubRetry:
83 |         policy: constant
84 |         duration: 5s
85 |         maxRetries: 10
86 |         matching:
87 |           gRPCStatusCodes: "1-15"
88 | ```
89 | 
90 | ### Acceptable Values
91 | The acceptable values are the same as the ones defined in the [HTTP Status Codes](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status) and [gRPC Status Codes](https://grpc.io/docs/guides/status-codes/) documentation:
92 | 
93 | - HTTP: from 100 to 599
94 | - gRPC: from 1 to 16
95 | 
96 | ### Setting Format
97 | Both the `httpStatusCodes` and `gRPCStatusCodes` fields are optional strings that can be set to a comma-separated list of status codes and/or ranges of status codes.
98 | A range must be in the format `<start>-<end>` (inclusive). Having more than one dash in a range is not allowed.
99 | 
100 | ### CRD Validation
101 | 
102 | Both field values should be validated using the Common Expression Language ([CEL](https://kubernetes.io/docs/reference/using-api/cel/)).
103 | In addition, see the Kubebuilder documentation for [CRD Validation](https://book.kubebuilder.io/reference/markers/crd-validation).
104 | 
105 | ### Parsing the configuration
106 | 
107 | The configuration values will first be parsed as comma-separated lists.
108 | Each entry in the list will then be parsed as a single status code or a range of status codes.
109 | Invalid entries will be logged and the Dapr runtime will fail to start. A sketch of this parsing logic is shown below, followed by a worked example.
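Here is a minimal sketch of that parsing logic in Go. The function name `parseStatusCodeFilter`, its arguments, and its placement are illustrative assumptions, not the actual `dapr/dapr` implementation:

```go
package resiliency

import (
	"fmt"
	"strconv"
	"strings"
)

// parseStatusCodeFilter parses a comma-separated list of status codes and
// inclusive ranges (e.g. "500,502-504") into the set of codes to retry.
// minCode and maxCode bound the valid values: 100-599 for HTTP, 1-16 for gRPC.
// Any error returned here is treated as fatal, so the runtime fails to start.
// (Hypothetical helper; names and placement are assumptions for illustration.)
func parseStatusCodeFilter(spec string, minCode, maxCode int) (map[int]struct{}, error) {
	codes := map[int]struct{}{}
	for _, entry := range strings.Split(spec, ",") {
		entry = strings.TrimSpace(entry)
		if entry == "" {
			continue // empty entries (e.g. from a trailing comma) are ignored
		}
		parts := strings.Split(entry, "-")
		switch len(parts) {
		case 1: // a single status code
			code, err := strconv.Atoi(parts[0])
			if err != nil || code < minCode || code > maxCode {
				return nil, fmt.Errorf("invalid status code %q", entry)
			}
			codes[code] = struct{}{}
		case 2: // an inclusive range in the format <start>-<end>
			start, errStart := strconv.Atoi(parts[0])
			end, errEnd := strconv.Atoi(parts[1])
			if errStart != nil || errEnd != nil || start < minCode || end > maxCode || start > end {
				return nil, fmt.Errorf("invalid status code range %q", entry)
			}
			for code := start; code <= end; code++ {
				codes[code] = struct{}{}
			}
		default: // more than one dash is not allowed
			return nil, fmt.Errorf("invalid status code range %q: more than one dash", entry)
		}
	}
	return codes, nil
}
```

Under this sketch, HTTP values would be parsed with bounds 100-599 and gRPC values with bounds 1-16, and any returned error would abort startup, matching the behavior described above.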
110 | 
111 | Example:
112 | 
113 | ```yaml
114 | apiVersion: dapr.io/v1alpha1
115 | kind: Resiliency
116 | metadata:
117 |   name: myresiliency
118 | scopes:
119 |   - app1
120 | spec:
121 |   policies:
122 |     retries:
123 |       pubsubRetry:
124 |         policy: constant
125 |         duration: 5s
126 |         maxRetries: 10
127 |         matching:
128 |           httpStatusCodes: "500,502-504,15,404-405-500,-1,0,"
129 | ```
130 | The steps to parse the configuration are:
131 | 1. Split the `httpStatusCodes` configuration string `"500,502-504,15,404-405-500,-1,0,"` by the comma character, resulting in the following list (empty strings are ignored): `["500", "502-504", "15", "404-405-500", "-1", "0"]`.
132 | 2. For each entry in the list, parse it as a single status code or a range of status codes.
133 | 3. If the entry is a single status code, add it to the list of status codes to retry.
134 | 4. If the entry is a range of status codes (validated against the bounds of the relevant HTTP or gRPC field), add all the status codes in the range to the list of status codes to retry.
135 |    - 500 is a **valid** code for HTTP
136 |    - 502-504 is a **valid** range of codes for HTTP
137 |    - 15 is an **invalid** code for HTTP; the error is logged and the application will fail to start
138 |    - 404-405-500 is an **invalid** range because it contains more than one dash; the error is logged and the application will fail to start
139 |    - -1 is an **invalid** code for HTTP; the error is logged and the application will fail to start
140 |    - 0 is an **invalid** code for HTTP; the error is logged and the application will fail to start
141 | 
142 | ### Acceptance Criteria
143 | 
144 | Integration and unit tests will be added to verify the new functionality.
145 | 
146 | ## Completion Checklist
147 | 
148 | * Code changes
149 | * Tests added (e2e, unit)
150 | * Documentation
151 | 
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 |                                  Apache License
2 |                            Version 2.0, January 2004
3 |                         http://www.apache.org/licenses/
4 | 
5 |    TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6 | 
7 |    1. Definitions.
8 | 
9 |       "License" shall mean the terms and conditions for use, reproduction,
10 |       and distribution as defined by Sections 1 through 9 of this document.
11 | 
12 |       "Licensor" shall mean the copyright owner or entity authorized by
13 |       the copyright owner that is granting the License.
14 | 
15 |       "Legal Entity" shall mean the union of the acting entity and all
16 |       other entities that control, are controlled by, or are under common
17 |       control with that entity. For the purposes of this definition,
18 |       "control" means (i) the power, direct or indirect, to cause the
19 |       direction or management of such entity, whether by contract or
20 |       otherwise, or (ii) ownership of fifty percent (50%) or more of the
21 |       outstanding shares, or (iii) beneficial ownership of such entity.
22 | 
23 |       "You" (or "Your") shall mean an individual or Legal Entity
24 |       exercising permissions granted by this License.
25 | 
26 |       "Source" form shall mean the preferred form for making modifications,
27 |       including but not limited to software source code, documentation
28 |       source, and configuration files.
29 | 
30 |       "Object" form shall mean any form resulting from mechanical
31 |       transformation or translation of a Source form, including but
32 |       not limited to compiled object code, generated documentation,
33 |       and conversions to other media types.
34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. 
You are solely responsible for determining the
150 |       appropriateness of using or redistributing the Work and assume any
151 |       risks associated with Your exercise of permissions under this License.
152 | 
153 |    8. Limitation of Liability. In no event and under no legal theory,
154 |       whether in tort (including negligence), contract, or otherwise,
155 |       unless required by applicable law (such as deliberate and grossly
156 |       negligent acts) or agreed to in writing, shall any Contributor be
157 |       liable to You for damages, including any direct, indirect, special,
158 |       incidental, or consequential damages of any character arising as a
159 |       result of this License or out of the use or inability to use the
160 |       Work (including but not limited to damages for loss of goodwill,
161 |       work stoppage, computer failure or malfunction, or any and all
162 |       other commercial damages or losses), even if such Contributor
163 |       has been advised of the possibility of such damages.
164 | 
165 |    9. Accepting Warranty or Additional Liability. While redistributing
166 |       the Work or Derivative Works thereof, You may choose to offer,
167 |       and charge a fee for, acceptance of support, warranty, indemnity,
168 |       or other liability obligations and/or rights consistent with this
169 |       License. However, in accepting such obligations, You may act only
170 |       on Your own behalf and on Your sole responsibility, not on behalf
171 |       of any other Contributor, and only if You agree to indemnify,
172 |       defend, and hold each Contributor harmless for any liability
173 |       incurred by, or claims asserted against, such Contributor by reason
174 |       of your accepting any such warranty or additional liability.
175 | 
176 |    END OF TERMS AND CONDITIONS
177 | 
178 |    APPENDIX: How to apply the Apache License to your work.
179 | 
180 |       To apply the Apache License to your work, attach the following
181 |       boilerplate notice, with the fields enclosed by brackets "[]"
182 |       replaced with your own identifying information. (Don't include
183 |       the brackets!) The text should be enclosed in the appropriate
184 |       comment syntax for the file format. We also recommend that a
185 |       file or class name and description of purpose be included on the
186 |       same "printed page" as the copyright notice for easier
187 |       identification within third-party archives.
188 | 
189 |    Copyright [yyyy] [name of copyright owner]
190 | 
191 |    Licensed under the Apache License, Version 2.0 (the "License");
192 |    you may not use this file except in compliance with the License.
193 |    You may obtain a copy of the License at
194 | 
195 |        http://www.apache.org/licenses/LICENSE-2.0
196 | 
197 |    Unless required by applicable law or agreed to in writing, software
198 |    distributed under the License is distributed on an "AS IS" BASIS,
199 |    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200 |    See the License for the specific language governing permissions and
201 |    limitations under the License.
202 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Dapr Proposals
2 | 
3 | ## Introduction
4 | 
5 | This repository stores proposals and designs for new features in Dapr (i.e. not bug fixes or minor changes) with the intention of improving visibility, keeping a historical record, and maintaining a consistent process.
6 | 
7 | ### What types of changes warrant a proposal here?
8 | 
9 | As mentioned above, any significant change that needs design and a conversation around that design should go here. As a guideline, anything that would warrant a change in the Dapr SDKs would probably require a proposal. Some specific examples would include:
10 | 
11 | * New Dapr building blocks
12 | * New APIs or breaking API changes (especially to a non-alpha component)
13 | 
14 | ## How do I create a proposal?
15 | 
16 | 1. Create a fork of this repository.
17 | 2. Copy the proposal template [templates/proposal.md](templates/proposal.md) following the format outlined below.
18 | 3. Edit the template, filling it in with the proposal (for guides, see information in `guides/`)
19 | 4. Submit a PR to `dapr/proposals` for community review.
20 | 
21 | ## Proposal name format
22 | 
23 | Proposal files are named in the following format:
24 | 
25 | > `YYYYMMDD-FLAGS-description.md`
26 | 
27 | Where *YYYY* is the 4-digit year, *MM* the 2-digit month, and *DD* the 2-digit day on which the proposal was last updated (like `20240309`, for example), and *FLAGS* is one (or possibly more) of:
28 | 
29 | * B - Building block change / creation
30 | * C - Components change / creation
31 | * I - Affects Dapr CLI
32 | * P - The proposal Process itself
33 | * R - Runtime
34 | * S - Affects SDKs
35 | 
36 | So, for example, a proposal to create a new building block, such as the workflow building block, might be something like `20240102-BRS-workflow-building-block.md`, whereas a change to the actor system, which does not require any changes to the SDKs themselves, would be something like `20240103-R-actor-reminder-system.md`.
37 | 
38 | ## Proposal Process
39 | 
40 | * The proposal will be opened as a PR against this repository
41 | * The proposal will be reviewed by the community and the author(s) of the proposal
42 | * The author(s) address questions/comments from the community in the proposal and adjust the proposal based on feedback
43 | * Once the feedback phase is complete, and a proposal has been accepted, the proposal will be merged into this repository
44 | * An issue needs to be created in dapr/dapr from the template in [templates/lifecycle.md](templates/lifecycle.md) to track the work that needs to be done to implement this proposal
45 | * Release of the feature will be slated for a specific release version of Dapr
46 | 
47 | ### Proposal acceptance
48 | 
49 | To accept a proposal, the maintainers of the relevant repository must vote using comments on the respective PR. A proposal is accepted by a majority vote supporting the proposal. When this condition is true, a maintainer from the relevant repository may approve and merge the proposal PR. While everyone is encouraged to participate and drive feedback, only the maintainers of the relevant repository have binding votes. Maintainers of other repositories and community contributors can cast non-binding votes to show support. The majority vote needed is a simple majority (more than 50% of total votes).
50 | 
51 | ## Feature lifecycle outline
52 | 
53 | Features in Dapr have a lifecycle (e.g. [Components](https://docs.dapr.io/operations/components/certification-lifecycle/)) and, as such, should have a defined set of milestones / requirements for progression between the lifecycle phases. For example, what can a user expect from a feature when it is Alpha quality? Once that is released, what is the plan to progress from Alpha to Beta, and the subsequent expectations? What is the expectation when this feature becomes Stable?
It is important to identify what functionality or performance guarantees we are making to users of Dapr when adding something new.
54 | 
55 | For example, the lifecycle expectations of a "Foobar API" that is going to replace an existing API might look something like:
56 | 
57 | Alpha:
58 | * Initial contract for the Foobar API is complete
59 | * Performance is expected to be >10TPS
60 | * Will not support serialization via XML
61 | * Data stored will not be compatible with the old API; existing data will be unavailable through this API (will need to use old API to access old data)
62 | * Only available when feature flag `foobar-api` is enabled
63 | * No migration of existing data from the old API available
64 | 
65 | Beta:
66 | * Performance meets or exceeds 1,000TPS
67 | * Enabled by default, users can opt-out via feature flag / configuration
68 | * Existing APIs marked as deprecated
69 | * Migration from previous data source / format can be done manually
70 | * XML will be supported
71 | * Backwards-incompatible changes may be made
72 | 
73 | 
74 | Stable:
75 | * Enabled by default, existing APIs have been removed fully
76 | * Documentation has been changed to remove previous API definitions
77 | * Migration from previous data source / format will be done automatically (lazily)
78 | * API is stable and changes will not be backwards-incompatible
79 | 
80 | 
81 | 
82 | ## Proposal Language
83 | 
84 | This information can be included either in the template or in a README -- and is designed to provide a common language for proposals so that the expectations are clear.
85 | 
86 | 
87 | ### Terminology
88 | 
89 | _(This is an incomplete list and should / will be expanded over time)_
90 | 
91 | | Term | Meaning |
92 | |------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|
93 | | Building block | Capabilities that solve common developmental challenges in building distributed applications |
94 | | API | Application Programming Interface - functionality exposed to end-users that can be used to interact with Dapr's building blocks in the application they are building |
95 | | Feature | New or enhanced functionality that is being added to Dapr |
96 | 
97 | ### Keywords
98 | 
99 | The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
100 | and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
101 | 
--------------------------------------------------------------------------------
/guides/api-design.md:
--------------------------------------------------------------------------------
1 | # API Design Guidelines
2 | 
3 | * Authors: Mukundan Sundararajan (@mukundansundar), John Ewart (@johnewart)
4 | * Updated: 10/25/2022
5 | 
6 | ## Proposal requirements
7 | 
8 | 
9 | For any API (new or updated), the following must be included in the proposal:
10 | 
11 | * Relevant high level design
12 | * Proposed contract for the API
13 |   * HTTP and gRPC APIs should be consistent in behavior and user experience.
14 | * Identifying what additions to existing components / creation of new components are required for this API (if any)
15 | * Scope for current and following releases (i.e. what can be expected from this iteration and what is being pushed down the road)
16 | * Known limitations, where applicable
17 |   * Performance issues
18 |   * Compatibility issues
19 | * Code examples (pseudocode is acceptable)
20 | 
21 | 
22 | ## API Lifecycle expectations
23 | 
24 | APIs are expected to go through three stages in their lifetime: Alpha, Beta, and Stable. For each of these phases, it should be clear to a user what they can expect. In the case of an API, those expectations are:
25 | 
26 | * **Alpha**
27 |   * API is not production ready yet and might contain bugs
28 |   * Recommended for non-business-critical use only because of the potential for incompatible changes in subsequent releases
29 |   * May not be backwards-compatible with an API it intends to replace
30 |   * May not be highly performant or support all SDKs
31 | 
32 | * **Beta**
33 |   * API is not production ready yet
34 |   * If an API moves into Beta, the intention is that it will continue on to become stable and not be removed
35 |   * Multiple components implement the API and the API contract is mostly finalized
36 |   * Recommended for non-business-critical use only due to potentially backwards-incompatible changes in subsequent releases
37 |   * Should have support in (at least) the _"core"_ SDKs _(i.e. Python, Go, Java)_
38 |   * Performance should be production-ready but may not be in all cases
39 | 
40 | * **Stable**
41 |   * API will not undergo backwards-incompatible changes
42 |   * API is considered ready for production usage
43 |   * Performance numbers are published for the API and there are tests and safeguards in place to prevent regression
44 | 
45 | 
46 | ## Requirements for API changes
47 | 
48 | No matter if the change is a net-new API or an update to an existing API, the following is required:
49 | 
50 | * Changes to documentation must be identified and written
51 | * Existing E2E and performance tests must pass
52 | * If a new command or a modification to an existing command is required to facilitate ease of use of the new API, the related code must be added to the Dapr CLI
53 | 
54 | ### Creation of new APIs
55 | 
56 | All new APIs that are defined start at the Alpha stage.
57 | 
58 | * Both HTTP and gRPC protocols should be supported for the new API
59 | * Documentation must be provided for the API
60 |   * HTTP API must be added to the `Reference` section in the Dapr documentation
61 | * Issues should be added in `dapr/quickstarts` to create examples for the new API to enable users to easily explore the new functionality provided by the API
62 | * If the new API is considered an _optimization_ of an existing API (say, the addition of `BulkGetSecrets` alongside `GetSecret`) then:
63 |   * The performance improvement gained due to this API should be documented
64 |   * Guidance must be provided to the users in docs as to when to use this API vs using the older one
65 | * Performance tests should (though preferably must) exist for this new API
66 | * _Should_ include new E2E tests that exercise the API
67 | 
68 | 
69 | ### Updates to existing APIs
70 | 
71 | Depending on the phase of the existing API, the proposed changes may or may not be backwards-incompatible.
72 | 
73 | _Backwards-**incompatible** changes_
74 | 
75 | * May _only_ be proposed to Alpha or Beta APIs
76 | * Require updates to existing E2E tests to support these changes
77 | * Breaking changes to existing Alpha or Beta APIs must be tracked and updated in docs/release notes
78 | 
79 | _Backwards-**compatible** changes_
80 | 
81 | * May be proposed to _any_ API
82 | * Proposed changes to both the HTTP and gRPC API must be included
83 | 
84 | 
85 | ## Requirements for Building Block changes
86 | 
87 | Finally, on addition of a new API, there may be the addition of the capability to an existing component or, if it is a new building block, the creation of a new set of components in the `dapr/components-contrib` repo.
88 | 
89 | ### Creating a new API as part of a new building block in `dapr/components-contrib`
90 | 
91 | - Interfaces to be used by `dapr/dapr` code must be defined and agreed upon
92 | - A new building block package is defined in the `components-contrib` repo; new code must only be added inside that building block package
93 | - Conformance tests enable validating a component's compliance with the defined interface for the building block and create a baseline for conformance testing any new components added. Conformance tests may be added for the new API with the understanding that it may evolve
94 | 
95 | 
96 | ### Creating a new API for an existing building block in `dapr/components-contrib`
97 | 
98 | - Interface changes for the new API must be defined and agreed upon
99 | - Existing components that support the new API must be enhanced to be in compliance with the proposed interface as per the defined and agreed upon scope of the original proposal
100 | - Conformance tests must be updated
101 |   - Get sign-off on a basic suite of conformance tests for the interface method(s)
102 |   - Implement the suite of conformance tests as part of the existing suite of tests for the building block
103 | - Ensure successful execution of existing conformance and certification tests for any modified components
104 | 
105 | 
106 | 
107 | ## Progression of an API/Building block
108 | 
109 | ### Alpha to Beta
110 | 
111 | In addition to the requirements of any Alpha API, the following must be met so that the API can graduate to Beta. For an API to be promoted to Beta, it must exist for at least one release cycle after its initial addition as Alpha.
(i.e. something added in 1.10 could become Beta in 1.12, having been stabilized through 1.11)
112 | 
113 | For all APIs, the following criteria need to be met:
114 | 
115 | * E2E tests with extensive positive and negative scenarios must be defined
116 | * Most (if not all) changes needed in the user-facing structures must be considered to be complete (in order to reduce the number of breaking changes)
117 | * All _"core"_ SDKs must have support for this API _(i.e. Python, Go, .NET, Java)_
118 | * Documentation of the API must be completely up-to-date
119 | * Quickstarts must be defined for the API allowing users to quickly explore the API
120 | * Performance tests should be added (if not already available in the Alpha stage) / updated where relevant
121 | 
122 | 
123 | For **building blocks** to progress, the following criteria are required:
124 | 
125 | * Conformance test(s) must be added (in case a new building block does not have conformance tests in the Alpha stage) / updated
126 | * Conformance tests must test both positive and negative cases (i.e. deliberately attempt to break them)
127 | * Certification tests should be added to the different components and this API must be exercised in the certification tests
128 | * Multiple implementations must be present for this building block
129 | 
130 | ### Beta to Stable
131 | 
132 | In addition to the requirements for a Beta API, the following requirements must be met so that the API can graduate to Stable. Similar to the previous phase change, this API must have been in the Beta phase for at least one full release _without any breaking changes_. In addition, the following criteria apply:
133 | 
134 | * E2E scenarios must be well defined and comprehensive
135 | * Performance tests must be added (in case a new building block does not have performance tests in the Alpha/Beta stage) / updated
136 | * Expected performance data must be added to documentation
137 | 
138 | For **building blocks** to progress, the following must also be true:
139 | 
140 | * E2E tests must exercise _at least two different implementations_ of the building block's API
141 | * Conformance tests testing both positive and negative cases must be defined
142 | * Certification tests for multiple components implementing this API must be defined
143 | 
--------------------------------------------------------------------------------
/resources/0004-BIRS-distributed-scheduler/bigPicture.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dapr/proposals/0a13058ce6dce66c90bbaea0c692c88ccacf3be1/resources/0004-BIRS-distributed-scheduler/bigPicture.png
--------------------------------------------------------------------------------
/resources/0004-BIRS-distributed-scheduler/pluggableSchedulerService.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dapr/proposals/0a13058ce6dce66c90bbaea0c692c88ccacf3be1/resources/0004-BIRS-distributed-scheduler/pluggableSchedulerService.png
--------------------------------------------------------------------------------
/resources/0004-BIRS-distributed-scheduler/publicDaprAPI.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dapr/proposals/0a13058ce6dce66c90bbaea0c692c88ccacf3be1/resources/0004-BIRS-distributed-scheduler/publicDaprAPI.png
--------------------------------------------------------------------------------
/resources/0004-BIRS-distributed-scheduler/sidecarToSchedulerComm.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dapr/proposals/0a13058ce6dce66c90bbaea0c692c88ccacf3be1/resources/0004-BIRS-distributed-scheduler/sidecarToSchedulerComm.png
--------------------------------------------------------------------------------
/resources/0004-BIRS-distributed-scheduler/watchJobsFlow.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dapr/proposals/0a13058ce6dce66c90bbaea0c692c88ccacf3be1/resources/0004-BIRS-distributed-scheduler/watchJobsFlow.png
--------------------------------------------------------------------------------
/resources/20221130-I-enhance-dapr-run-multiple-apps/interaction-flow-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dapr/proposals/0a13058ce6dce66c90bbaea0c692c88ccacf3be1/resources/20221130-I-enhance-dapr-run-multiple-apps/interaction-flow-1.png
--------------------------------------------------------------------------------
/resources/20230327-RCBS-Crypto-building-block/data-flow.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dapr/proposals/0a13058ce6dce66c90bbaea0c692c88ccacf3be1/resources/20230327-RCBS-Crypto-building-block/data-flow.png
--------------------------------------------------------------------------------
/resources/README.md:
--------------------------------------------------------------------------------
1 | ## What's in here?
2 | 
3 | Any files related to proposals (images, attachments, etc.) that are needed to embed into a proposal should go in here.
4 | 
5 | Assets should be kept in a directory that matches the proposal name - i.e. `20230114-R-foo-bar.md` should store its assets in `resources/20230114-R-foo-bar/`
6 | 
7 | 
--------------------------------------------------------------------------------
/templates/lifecycle.md:
--------------------------------------------------------------------------------
1 | # Feature name
2 | 
3 | > If you need more information on what needs to be completed, look in the `guides` directory for relevant guidance.
4 | 
5 | # Links
6 | 
7 | Links to any relevant resources go here:
8 | 
9 | * Relevant proposal
10 | * Existing issues
11 | * Milestones
12 | 
13 | # Lifecycle Expectations
14 | 
15 | ## Alpha / Beta / Stable
16 | 
17 | For each stage, identify the expectations of this feature at that stage. For example,
18 | are there any performance issues, configuration changes, or feature deprecations that will happen?
19 | 
20 | * Anticipated performance / known limitations
21 | * Compatibility guarantees / requirements
22 | * Deprecation / co-existence with existing functionality
23 | * Feature flags required
24 | 
25 | # Acceptance criteria
26 | 
27 | > For each of the stages, add any specific tasks that need to be completed before this feature reaches that particular stage. If a particular item is *not needed* then a reason should be given.
28 | 
29 | ## Alpha
30 | 
31 | - [ ] Minimum of one core SDK supports this feature (.NET / Python / Go / Java)
32 | - [ ] Feature documentation added to `dapr/docs`
33 | - [ ] Telemetry data (metrics) available for this feature
34 | - [ ] Issue opened in `dapr/quickstarts` for quickstart examples to be created
35 | 
36 | Additionally, for **APIs**:
37 | 
38 | - [ ] Both HTTP and gRPC protocols implemented
39 | - [ ] HTTP API documentation added to the `Reference` section of Dapr documentation
40 | 
41 | Additionally, for **building blocks**:
42 | 
43 | - [ ] Interfaces to be used by `dapr/dapr` code defined and agreed upon
44 | - [ ] New building block package is defined in `components-contrib` repo
45 | - [ ] Conformance tests validating the component's compliance added
46 | - [ ] Minimum of _one_ implementation (preferably something already in-use such as Redis if possible to reduce complexity)
47 | 
48 | 
49 | ## Beta
50 | 
51 | - [ ] E2E tests are up-to-date and comprehensive
52 | - [ ] SDK spec is updated
53 | - [ ] No major changes to the API have occurred in the last XXX time period (releases? months?)
54 | - [ ] Support in core SDKs
55 |   - [ ] Python
56 |   - [ ] Go
57 |   - [ ] Java
58 |   - [ ] .NET
59 |   - [ ] JavaScript
60 | - [ ] Documentation up-to-date with any new changes since Alpha
61 | - [ ] Quickstarts have been created
62 | - [ ] Performance tests exist but do not block builds
63 | 
64 | Additionally, for **building blocks**:
65 | 
66 | - [ ] Conformance tests updated to match any API changes that have been made
67 | - [ ] Conformance tests exercise both positive and negative cases
68 | - [ ] Minimum of N (three?) implementations of this building block
69 | - [ ] Certification tests for implementations
70 | - [ ] APIs that are used in the building block also meet Beta criteria
71 | 
72 | 
73 | ## Stable
74 | 
75 | 
76 | - [ ] Documentation is complete in `dapr/docs` with any changes since Beta
77 | - [ ] E2E scenarios well defined and comprehensive
78 | - [ ] Performance tests exist and regressions will prevent them from successfully passing
79 | - [ ] Performance data added to documentation (https://docs.dapr.io/operations/performance-and-scalability/)
80 | 
81 | 
--------------------------------------------------------------------------------
/templates/proposal.md:
--------------------------------------------------------------------------------
1 | # Title of proposal
2 | 
3 | * Author(s): [Author Name, Co-Author Name ...]
4 | * State: {Ready for Implementation, Implemented}
5 | * Updated: [Date]
6 | 
7 | ## Overview
8 | 
9 | A brief description of the proposal; include information such as:
10 | 
11 | * What areas are affected by this change?
12 | * What is being proposed in this document?
13 | 
14 | ## Background
15 | 
16 | This section is intended to provide the community with the reasoning behind this proposal -- why is this proposal being made? What problem is it solving for users / developers / operators and how does it solve that for them?
17 | 
18 | ## Related Items
19 | 
20 | ### Related proposals
21 | 
22 | Links to proposals that are related to this (either due to dependency, or possibly because this will replace another proposal)
23 | 
24 | ### Related issues
25 | 
26 | Please link to any issues that this proposal is related to. For example, are there existing bugs filed in various Dapr repositories that this will affect?
27 | 
28 | 
29 | ## Expectations and alternatives
30 | 
31 | * What is in scope for this proposal?
32 | * What is deliberately *not* in scope?
33 | * What alternatives have been considered, and why do they not solve the problem?
34 | * Are there any trade-offs being made? (space for time, for example)
35 | * What advantages / disadvantages does this proposal have?
36 | 
37 | ## Implementation Details
38 | 
39 | ### Design
40 | 
41 | How will this work, technically? Where applicable, include:
42 | 
43 | * Design documents
44 | * System diagrams
45 | * Code examples
46 | 
47 | ### Feature lifecycle outline
48 | 
49 | * Expectations
50 | * Compatibility guarantees
51 | * Deprecation / co-existence with existing functionality
52 | * Feature flags
53 | 
54 | ### Acceptance Criteria
55 | 
56 | How will success be measured?
57 | 
58 | * Performance targets
59 | * Compatibility requirements
60 | * Metrics
61 | 
62 | ## Completion Checklist
63 | 
64 | What changes or actions are required to make this proposal complete? Some examples:
65 | 
66 | * Code changes
67 | * Tests added (e2e, unit)
68 | * SDK changes (if needed)
69 | * Documentation
70 | 
71 | 
--------------------------------------------------------------------------------