├── .gitignore
├── README.md
├── README_data_release_2023.md
├── README_data_release_2025.md
├── architecture.png
├── environment.yml
├── papers
│   ├── Serverless_Cold_Starts_and_Where_to_Find_Them_EuroSys_2025.pdf
│   └── SoCC_2023_How_does_it_function.pdf
└── src
    ├── demo_cold_start.ipynb
    ├── demo_private.ipynb
    └── demo_public.ipynb
/.gitignore:
--------------------------------------------------------------------------------
1 | datasets
2 | scratch_notebooks
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | Huawei Cloud Production FaaS Trace Data Releases
2 |
3 | This repository contains public releases of Huawei Cloud production serverless FaaS traces made available to the research and academic community by Huawei's Systems Infrastructure Research (SIR) lab in Edinburgh, UK.
4 |
5 | These public traces are hashed versions of our raw production logs of tens of billions of user requests over multiple data centers and many months. We release them to enable researchers to conduct realistic simulations and train machine learning models to improve scheduling and resource allocation in cloud platforms.
6 |
7 | The data is analyzed in two papers:
8 |
9 | * EuroSys 2025: ***Serverless Cold Starts and Where to Find Them***
10 | * Links to our paper PDF: [arXiv](https://arxiv.org/abs/2410.06145) [ACM Digital Library](https://dl.acm.org/doi/10.1145/3689031.3696073) [GitHub](https://github.com/sir-lab/data-release/blob/edits/papers/Serverless_Cold_Starts_and_Where_to_Find_Them_EuroSys_2025.pdf)
11 |
12 | * ACM SoCC 2023: ***How Does It Function? Characterizing Long-term Trends in Production Serverless Workloads***
13 |
14 | * Links to our paper PDF: [arXiv](https://arxiv.org/abs/2312.10127) [ACM Digital Library](https://dl.acm.org/doi/10.1145/3620678.3624783) [GitHub](https://github.com/sir-lab/data-release/blob/main/papers/SoCC_2023_How_does_it_function.pdf)
15 | * Conference video presentation: [YouTube](https://www.youtube.com/watch?v=fNhd7vIJgRc) [BiliBili](https://www.bilibili.tv/en/video/4789602907980800)
16 |
17 | ## How to download the data
18 |
19 | The datasets used in our papers can be downloaded at the links below.
20 | * Huawei Public cold start traces 2025 contains 85 billion raw user requests and 11.9 million cold start events in 5 regions. There are 19 metrics per function over 31 days, as well as aggregated time series formats for convenience.
21 | * Huawei Private and Public traces 2023 contains 1.4 trillion function requests in time series format. There are 8 metrics per function over 235 days at per-minute and per-second granularity for two serverless platforms.
22 |
23 | In some cases, you may not need all files in a zip folder. You can use 7zip to drag and drop the desired files or directories without extracting the entire archive.
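
If you prefer to script this step, Python's standard `zipfile` module can also pull out selected members without unpacking the whole archive. The sketch below is one way to do it; the function name `extract_subset` and the prefix used in the note afterwards are illustrative assumptions, not names taken from the release.

```python
import zipfile
from pathlib import Path


def extract_subset(zip_path, prefix, dest_dir):
    """Extract only the archive members whose names start with `prefix`.

    Returns the list of extracted member names.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # namelist() reads the archive's directory only; no file data is decompressed here.
        wanted = [name for name in archive.namelist() if name.startswith(prefix)]
        # extractall(members=...) decompresses just the selected entries.
        archive.extractall(path=dest, members=wanted)
    return wanted
```

For example, `extract_subset("requests_minute.zip", "requests_minute/day_00", "datasets/private_dataset")` would extract only members under that hypothetical prefix. Inspect `ZipFile.namelist()` first to see how a given archive is actually laid out.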
24 |
25 | ## Code
26 |
27 | To get started using the datasets, look at our notebooks for tips on how to load files and visualize the data.
28 | * Huawei Public cold starts 2025 notebook
29 | * Huawei Private 2023 notebook
30 | * Huawei Public 2023 notebook
31 |
32 | To run our notebooks with the required packages installed, you can install our conda environment as follows:
33 |
34 | ```bash
35 | conda env create -f environment.yml
36 | conda activate trace-analysis
37 | ```
38 |
39 | ## Contact us
40 |
41 | We welcome feedback, collaboration, or questions. Feel free to open an [Issue](https://github.com/sir-lab/data-release/issues).
42 |
43 | These traces and associated research result from a collaboration between the Systems Infrastructure Research (SIR) lab in Edinburgh (part of Huawei Research UK) and Huawei's YuanRong serverless cloud platform team.
--------------------------------------------------------------------------------
/README_data_release_2023.md:
--------------------------------------------------------------------------------
1 | Huawei Public Cloud and Huawei Private Cloud data release 2023
2 |
3 | This is the repository for ***How Does It Function? Characterizing Long-term Trends in Production Serverless Workloads*** published at ACM SoCC 2023.
4 |
5 | The paper analyzes the Huawei Public Cloud and Huawei Private Cloud datasets, which are available for download below.
6 |
7 | We also provide two Jupyter Notebooks that show how to load the data as a Pandas DataFrame and make plots.
8 |
9 | ### Download/read our paper
10 |
11 | * ACM Library
12 | * arXiv
13 | * GitHub
14 |
15 | ### Conference video presentation
16 |
17 | * YouTube
18 | * BiliBili
19 |
20 | ## Code
21 |
22 | To get started using the datasets, look at our notebooks for tips on how to load files and visualize the data.
23 | * Huawei Private 2023 notebook
24 | * Huawei Public 2023 notebook
25 |
26 |
27 | ## How to download the data
28 |
29 | The datasets used in our paper can be downloaded here:
30 |
31 | ### Huawei Private
32 |
33 | This dataset contains 141 days of data (collected over a 235-day period) for 200 functions, combined across all availability zones of our private cloud.
34 |
35 | |Metric |Minute |Second |Description |
36 | |---------------|---------------|------------|---------------|
37 | |Function ID | - | - |Unique function identifier out of 200 (0-199) |
38 | |Timestamp | - | - | Timestamp in seconds (0-20303940) |
39 | |Requests |[Requests per minute](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/private_dataset/requests_minute.zip) |[Requests per second](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/private_dataset/requests_second.zip) | Number of function invocations |
40 | |Function delay |[Function delay per minute](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/private_dataset/function_delay_minute.zip) |[Function delay per second](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/private_dataset/function_delay_second.zip) | Function execution time averaged over all pods running that function |
41 | |Platform delay |[Platform delay per minute](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/private_dataset/platform_delay_minute.zip) |[Platform delay per second](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/private_dataset/platform_delay_second.zip )| Platform delay is scheduling time and some network overheads; averaged over all pods running that function |
42 | |CPU usage |[CPU usage per minute](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/private_dataset/cpu_usage_minute.zip) |N/A |Percentage of allocated CPU used per function averaged over all pods |
43 | |Memory usage |[Memory usage per minute](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/private_dataset/memory_usage_minute.zip) |N/A | Percentage of allocated memory used per function averaged over all pods |
44 | |CPU limit |[CPU limit per minute](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/private_dataset/cpu_limit_minute.zip) |N/A | Allocated CPU for all pods running that function (normalized)|
45 | |Memory limit |[Memory limit per minute](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/private_dataset/memory_limit_minute.zip) |N/A | Allocated memory for all pods running that function (MB) |
46 | |Instances |[Instances per minute](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/private_dataset/instances_minute.zip) |N/A | Number of pods allocated to that function |
47 |
48 | Note: For Huawei Private, requests, function delay, and platform delay are originally expressed per second. We provide aggregated per-minute versions of these metrics for convenience. Requests per minute are obtained by summing requests per second every 60 seconds. Function and platform delay per minute are obtained by taking the mean every 60 seconds.
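
The aggregation described in this note can be sketched in plain Python. This is a minimal illustration on synthetic values, assuming one value per second for a single function and that each minute covers 60 consecutive seconds. The function names are hypothetical, and the released per-minute files were not necessarily produced with this code.

```python
from statistics import fmean


def per_minute_sum(per_second):
    """Sum consecutive 60-second windows (used for request counts)."""
    return [sum(per_second[i:i + 60]) for i in range(0, len(per_second), 60)]


def per_minute_mean(per_second):
    """Average consecutive 60-second windows (used for function and platform delay)."""
    return [fmean(per_second[i:i + 60]) for i in range(0, len(per_second), 60)]


# Synthetic example: two minutes of data for one function.
requests_second = [1] * 60 + [2] * 60
delay_second = [0.5] * 60 + [1.5] * 60

requests_minute = per_minute_sum(requests_second)  # [60, 120]
delay_minute = per_minute_mean(delay_second)       # [0.5, 1.5]
```

How seconds with missing values are treated in the released per-minute files is not stated here, so compare against the provided files before relying on your own aggregation.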
49 |
50 |
51 | ### Huawei Public
52 |
53 | This dataset contains 26 days of data for 5093 functions from one availability zone of our public cloud.
54 |
55 | |Metric |Minute |Description |
56 | |---------------|---------------|-----------------|
57 | |Function ID | - |Unique function identifier out of 5093 (0-5092) |
58 | |Timestamp | - | Timestamp in seconds (0-2246340) |
59 | |Requests |[Requests per minute](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/public_dataset/public_dataset.zip)| Number of function invocations |
60 |
61 |
62 |
63 | After downloading the data, the folder structure should look like this.
64 | ```console
65 | .
66 | ├── demo_private.ipynb
67 | ├── demo_public.ipynb
68 | └── datasets
69 |     ├── private_dataset
70 |     │   ├── cpu_limit_minute
71 |     │   │   ├── day_000.csv
72 |     │   │   ├── day_001.csv
73 |     │   │   ├── ...
74 |     │   │   ├── day_233.csv
75 |     │   │   └── day_234.csv
76 |     │   ├── cpu_usage_minute
77 |     │   │   ├── day_000.csv
78 |     │   │   ├── day_001.csv
79 |     │   │   ├── ...
80 |     │   │   ├── day_233.csv
81 |     │   │   └── day_234.csv
82 |     │   ├── function_delay_minute
83 |     │   │   ├── day_000.csv
84 |     │   │   ├── day_001.csv
85 |     │   │   ├── ...
86 |     │   │   ├── day_233.csv
87 |     │   │   └── day_234.csv
88 |     │   ├── function_delay_second
89 |     │   │   ├── day_000.csv
90 |     │   │   ├── day_001.csv
91 |     │   │   ├── ...
92 |     │   │   ├── day_233.csv
93 |     │   │   └── day_234.csv
94 |     │   ├── instances_minute
95 |     │   │   ├── day_000.csv
96 |     │   │   ├── day_001.csv
97 |     │   │   ├── ...
98 |     │   │   ├── day_233.csv
99 |     │   │   └── day_234.csv
100 |     │   ├── memory_limit_minute
101 |     │   │   ├── day_000.csv
102 |     │   │   ├── day_001.csv
103 |     │   │   ├── ...
104 |     │   │   ├── day_233.csv
105 |     │   │   └── day_234.csv
106 |     │   ├── memory_usage_minute
107 |     │   │   ├── day_000.csv
108 |     │   │   ├── day_001.csv
109 |     │   │   ├── ...
110 |     │   │   ├── day_233.csv
111 |     │   │   └── day_234.csv
112 |     │   ├── platform_delay_minute
113 |     │   │   ├── day_000.csv
114 |     │   │   ├── day_001.csv
115 |     │   │   ├── ...
116 |     │   │   ├── day_233.csv
117 |     │   │   └── day_234.csv
118 |     │   ├── platform_delay_second
119 |     │   │   ├── day_000.csv
120 |     │   │   ├── day_001.csv
121 |     │   │   ├── ...
122 |     │   │   ├── day_233.csv
123 |     │   │   └── day_234.csv
124 |     │   ├── requests_minute
125 |     │   │   ├── day_000.csv
126 |     │   │   ├── day_001.csv
127 |     │   │   ├── ...
128 |     │   │   ├── day_233.csv
129 |     │   │   └── day_234.csv
130 |     │   └── requests_second
131 |     │       ├── day_000.csv
132 |     │       ├── day_001.csv
133 |     │       ├── ...
134 |     │       ├── day_233.csv
135 |     │       └── day_234.csv
136 |     └── public_dataset
137 |         └── requests_minute
138 |             ├── day_00.csv
139 |             ├── day_01.csv
140 |             ├── ...
141 |             ├── day_24.csv
142 |             └── day_25.csv
143 | ```
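
Given the layout above, a small helper can build the expected CSV path for a metric and day and report which files are absent locally. This is a sketch: the function names are assumptions, while the directory names and zero-padding (three digits for the private dataset, two for the public one) follow the tree shown above.

```python
from pathlib import Path


def day_file(root, dataset, metric, day):
    """Return the expected path of one daily CSV file.

    dataset: "private_dataset" (day_000..day_234) or "public_dataset" (day_00..day_25).
    """
    width = 3 if dataset == "private_dataset" else 2
    return Path(root) / dataset / metric / f"day_{day:0{width}d}.csv"


def missing_days(root, dataset, metric, num_days):
    """List the day indices whose CSV file is not present under `root`."""
    return [d for d in range(num_days)
            if not day_file(root, dataset, metric, d).exists()]
```

For instance, `missing_days("datasets", "private_dataset", "requests_minute", 235)` returns an empty list once every daily file for that metric has been extracted.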
144 |
--------------------------------------------------------------------------------
/README_data_release_2025.md:
--------------------------------------------------------------------------------
1 | Huawei Public Cloud Trace 2025
2 |
3 |
4 | This is the description of the Huawei Public Cloud Trace 2025. The data is described in our paper, ***Serverless Cold Starts and Where to Find Them*** (EuroSys 2025).
5 |
6 | We provide a schema of the dataset as well as several notebooks and scripts to show how to load the data and make plots. We also provide the links to download the dataset.
7 |
8 | # Table of Contents
9 | - [Table of Contents](#table-of-contents)
10 | - [How to download the data](#how-to-download-the-data)
11 | - [Code](#code)
12 | - [Huawei Public cold starts](#huawei-public-cold-starts)
13 | - [Schema](#schema)
14 | - [Huawei Public request tables](#huawei-public-request-tables)
15 | - [Schema](#schema-1)
16 | - [Huawei Public trigger types and runtime languages (2025)](#huawei-public-trigger-types-and-runtime-languages-2025)
17 | - [Schema](#schema-2)
18 | - [Huawei Public time series (2025)](#huawei-public-time-series-2025)
19 | - [Schema](#schema-3)
20 |
21 | # How to download the data
22 |
23 | The datasets used in our paper can be downloaded at the links below.
24 |
25 | In some cases, you may not need all files in a zip folder. You can use 7zip to drag and drop the desired files or directories without extracting the entire archive.
26 |
27 | ## Code
28 |
29 | Use the Huawei Public cold starts 2025 notebook to get started analyzing the data.
30 |
31 | ## Huawei Public cold starts
32 |
33 | These files contain every cold start analyzed in our 2025 paper.
34 |
35 | **Duration:** 31 days
36 |
37 | |Metric/link |Description |
38 | |--------------------|---------------------|
39 | |[Region 1 cold starts](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/cold_start/R1.zip)|Region 1 cold start events|
40 | |[Region 2 cold starts](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/cold_start/R2.zip)|Region 2 cold start events|
41 | |[Region 3 cold starts](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/cold_start/R3.zip)|Region 3 cold start events|
42 | |[Region 4 cold starts](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/cold_start/R4.zip)|Region 4 cold start events|
43 | |[Region 5 cold starts](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/cold_start/R5.zip)|Region 5 cold start events|
44 |
45 | ### Schema
46 | This table contains individual cold start events. A cold start involves several steps, including pod allocation, code deployment, dependency deployment, and scheduling and other overheads.
47 |
48 | | Name | Log | Description | Unit | Example |
49 | |-------|-----|-------------|---------|----------|
50 | | day | cold start | day | date | 0-30 |
51 | | time | cold start | cold start timestamp | seconds | 51937.425 |
52 | | clusterName | cold start | cluster processing | int | can be 1, 2, 3, 4 |
53 | | funcName| cold start | function name | int | 10 |
54 | | userID | cold start | user ID | int | 134 |
55 | | requestID | cold start | unique request identifier of the request that triggered the scaling decision. Note that one request may trigger multiple cold start events. | - | 82f1aa192e8ce27...|
56 | | totalCost_cold_start | cold start | total time spent on cold start. This should be used as the cold start time | seconds | 2.239314 |
57 | | podAllocationCost | cold start | the time taken to start a pod if no free pods exist or to select a pod from the existing pool to be used by the newly started function | seconds | 0.027161815 |
58 | | deployCodeCost | cold start | time to download, extract, and deploy function code | seconds | 0.045 |
59 | | deployDependencyCost | cold start | time to fetch and load dependencies | seconds | 0.454 |
60 | | schedulingCost | cold start | time for networking, routing, and scheduling overheads | seconds | 1.713152185 |
61 | | podID | cold start | pod ID, contains the pool name (pool24-600-512) which contains pods with CPU limit 600 millicores and memory limit 512MB. | - | pool24-600-512-0000577863 |
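
In the example values listed in this table, the four component costs add up to `totalCost_cold_start`. The sketch below checks this for that single example row and computes each component's share. Whether the identity holds exactly for every row in the released files is an assumption you should verify on the data. The dictionary simply copies the Example column; `component_shares` is a hypothetical helper.

```python
import math

# Example values copied from the schema table (seconds).
row = {
    "totalCost_cold_start": 2.239314,
    "podAllocationCost": 0.027161815,
    "deployCodeCost": 0.045,
    "deployDependencyCost": 0.454,
    "schedulingCost": 1.713152185,
}

COMPONENTS = ("podAllocationCost", "deployCodeCost",
              "deployDependencyCost", "schedulingCost")


def component_shares(row):
    """Return each component's fraction of the total cold start time."""
    total = row["totalCost_cold_start"]
    return {name: row[name] / total for name in COMPONENTS}


component_sum = sum(row[name] for name in COMPONENTS)
# The components of the example row sum to the reported total (up to float rounding).
assert math.isclose(component_sum, row["totalCost_cold_start"], rel_tol=1e-9)

shares = component_shares(row)  # schedulingCost is the largest share in this example row
```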
62 |
63 | ## Huawei Public request tables
64 |
65 | This table contains event-level logs of individual requests for day 30 of our dataset. Note that for Region 1, we release the top 100 pods per function. For Region 3, we also release a subsample of the data analyzed in the paper.
66 |
67 | **Duration:** 1 day
68 |
69 | |Metric |Download link(s) |
70 | |--------------------|---------------------|
71 | |Region 1 requests| [Region 1 part 1](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R1/R1_00000_00009.zip), [Region 1 part 2](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R1/R1_00010_00019.zip), [Region 1 part 3](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R1/R1_00020_00029.zip), [Region 1 part 4](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R1/R1_00030_00039.zip), [Region 1 part 5](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R1/R1_00040_00049.zip), [Region 1 part 6](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R1/R1_00050_00059.zip), [Region 1 part 7](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R1/R1_00060_00069.zip), [Region 1 part 8](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R1/R1_00070_00079.zip), [Region 1 part 9](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R1/R1_00080_00089.zip), [Region 1 part 10](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R1/R1_00090_00099.zip), [Region 1 part 11](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R1/R1_00100_00109.zip), [Region 1 part 12](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R1/R1_00110_00119.zip), [Region 1 part 13](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R1/R1_00120_00129.zip), [Region 1 part 14](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R1/R1_00130_00139.zip), [Region 1 part 15](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R1/R1_00140_00149.zip), [Region 1 part 16](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R1/R1_00150_00159.zip), [Region 1 part 17](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R1/R1_00160_00169.zip), [Region 1 part 18](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R1/R1_00170_00179.zip), [Region 1 part 19](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R1/R1_00180_00189.zip), [Region 1 part 20](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R1/R1_00190_00199.zip) |
72 | |Region 2 requests| [Region 2 part 1](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R2/R2_00000_00019.zip), [Region 2 part 2](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R2/R2_00020_00039.zip), [Region 2 part 3](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R2/R2_00040_00059.zip), [Region 2 part 4](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R2/R2_00060_00079.zip), [Region 2 part 5](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R2/R2_00080_00099.zip), [Region 2 part 6](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R2/R2_00100_00119.zip), [Region 2 part 7](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R2/R2_00120_00139.zip), [Region 2 part 8](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R2/R2_00140_00159.zip), [Region 2 part 9](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R2/R2_00160_00179.zip), [Region 2 part 10](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R2/R2_00180_00199.zip) |
73 | |Region 3 requests| [Region 3 part 1](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R3/R3_00000_00039.zip), [Region 3 part 2](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R3/R3_00040_00079.zip), [Region 3 part 3](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R3/R3_00080_00119.zip), [Region 3 part 4](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R3/R3_00120_00159.zip), [Region 3 part 5](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R3/R3_00160_00199.zip) |
74 | |Region 4 requests| [Region 4 part 1](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R4/R4_00000_00099.zip), [Region 4 part 2](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R4/R4_00100_00199.zip) |
75 | |Region 5 requests|[Region 5](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/per_request/R5/R5.zip)|
76 |
77 |
78 | ### Schema
79 |
80 | Huawei's serverless platform produces logs in several different components. This dataset includes data from three main types of logs: frontend, worker, and cold start.
81 |
82 | This section briefly describes the architecture of our system, shown in the diagram below. When a user sends a request, it arrives at the frontend. The frontend sends the request to a VM, which receives it through the busProxy. The request is then handled by a function worker pod. The function worker pod contains a runtime in which the function runs, and a worker component that monitors and interfaces with the runtime.
83 |
84 | ![Architecture diagram](architecture.png)
85 |
86 | The timestamps have been normalized to start at 0 on day 0, so 2600186.994 means 2600186 seconds and 994 milliseconds after the start of the dataset.
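
As a sketch of how to interpret these normalized timestamps, the snippet below splits one into a day index and a time of day. It assumes each day spans exactly 86,400 seconds from the start of the trace; the function name is illustrative.

```python
SECONDS_PER_DAY = 86_400


def split_timestamp(ts):
    """Split a normalized timestamp (seconds since the start of day 0)
    into (day index, hours, minutes, seconds within the minute)."""
    day, seconds_in_day = divmod(ts, SECONDS_PER_DAY)
    hours, rest = divmod(seconds_in_day, 3600)
    minutes, seconds = divmod(rest, 60)
    return int(day), int(hours), int(minutes), seconds


day, hours, minutes, seconds = split_timestamp(2600186.994)
# day == 30, hours == 2, minutes == 16, seconds is approximately 26.994
```

Under this assumption the example timestamp falls on day 30, the day covered by the request tables.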
87 |
88 | **Resource usage** The CPU usage of a request is the CPU usage of the entire pod, averaged over the execution time of that request. Similarly, the memory usage of a request is the memory usage of the entire pod, averaged over the execution time of that request.
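
Because the pool name embedded in `podID` encodes the pod's CPU and memory limits (see the podID row of the schema table), per-request usage can be related to those limits. The following is a minimal sketch using the example values from the table. It assumes the `<pool>-<CPU millicores>-<memory MB>-<pod number>` pattern seen in the examples holds for all pods, and the helper names are hypothetical.

```python
def parse_pod_id(pod_id):
    """Split a podID such as 'pool22-300-128-0000618412' into its parts.

    Assumes the pattern <pool>-<CPU limit in millicores>-<memory limit in MB>-<pod number>.
    """
    pool, cpu_millicores, memory_mb, pod_number = pod_id.split("-")
    return {
        "pool_name": f"{pool}-{cpu_millicores}-{memory_mb}",
        "cpu_limit_cores": int(cpu_millicores) / 1000,
        "memory_limit_mb": int(memory_mb),
        "pod_number": pod_number,
    }


def utilization(cpu_usage_cores, memory_usage_mb, pod_id):
    """Return (CPU, memory) usage as fractions of the pod's limits."""
    limits = parse_pod_id(pod_id)
    return (cpu_usage_cores / limits["cpu_limit_cores"],
            memory_usage_mb / limits["memory_limit_mb"])


# Example values from the request table schema.
cpu_frac, mem_frac = utilization(0.291506, 32.308594, "pool22-300-128-0000618412")
# cpu_frac is about 0.97 of the 300-millicore limit; mem_frac is about 0.25 of 128 MB
```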
89 |
90 |
91 | | Name | Log | Description | Unit | Example |
92 | |-------------------|----------|------------------------|--------|------------------------|
93 | | time_worker | worker | worker timestamp | seconds| 2600186.994 |
94 | | time_frontend | frontend | frontend timestamp | seconds| 2600186.995 |
95 | | requestID | worker | unique request identifier | hash | 763ff09a... |
96 | | clusterName | worker | cluster starting pod | int | can be 1, 2, 3, 4 |
97 | | funcName | worker | function name | int | 296 |
98 | | podID | worker | pod ID, contains the pool name (pool22-300-128) which contains pods with CPU limit 300 millicores and memory limit 128MB. | - | pool22-300-128-0000618412 |
99 | | userID | worker | unique user ID | int | 224 |
100 | | totalCost_worker | worker | total time spent in function worker pod | seconds| 0.023749 |
101 | | workerCost | worker | time spent in worker | seconds| 0.000112 |
102 | | runtimeCost | worker | time spent in runtime | seconds| 0.023637 |
103 | | totalCost_frontend| frontend | end-to-end time as measured by frontend | seconds| 0.026421 |
104 | | frontendCost | frontend | time spent monitoring and forwarding in frontend | seconds| 0.000364 |
105 | | busCost | frontend | time spent in busProxy component | seconds| 0.001994 |
106 | | readBodyCost | frontend | time spent reading request body | seconds| 0.000016 |
107 | | writeRspCost | frontend | time spent writing response | seconds| 0.000006 |
108 | | cpu_usage | worker | CPU usage | cores | 0.291506 |
109 | | memory_usage | worker | memory usage | MB | 32.308594 |
110 | | requestBodySize | frontend | size of the request body | Bytes | 1018 |
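
The example values in this table illustrate how the latency fields relate to each other: `workerCost` plus `runtimeCost` equals `totalCost_worker` in the example row, while `totalCost_frontend` is larger than the sum of the listed frontend parts and the worker total. The sketch below computes that remainder. Treating it as unattributed time is an assumption made here for illustration, not a definition from the dataset, and `unattributed_time` is a hypothetical helper.

```python
import math

# Example values copied from the schema table (seconds).
request = {
    "totalCost_worker": 0.023749,
    "workerCost": 0.000112,
    "runtimeCost": 0.023637,
    "totalCost_frontend": 0.026421,
    "frontendCost": 0.000364,
    "busCost": 0.001994,
    "readBodyCost": 0.000016,
    "writeRspCost": 0.000006,
}

# The worker-side total is the sum of its two parts in the example row.
assert math.isclose(request["workerCost"] + request["runtimeCost"],
                    request["totalCost_worker"], abs_tol=1e-9)

FRONTEND_PARTS = ("frontendCost", "busCost", "readBodyCost", "writeRspCost")


def unattributed_time(req):
    """End-to-end time not covered by the listed frontend parts or the worker total."""
    listed = sum(req[name] for name in FRONTEND_PARTS) + req["totalCost_worker"]
    return req["totalCost_frontend"] - listed


remainder = unattributed_time(request)  # roughly 0.000292 s for the example row
```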
111 |
112 |
113 | ## Huawei Public trigger types and runtime languages (2025)
114 |
115 | This file contains the runtime languages, trigger types, and CPU request for each function in Region 2.
116 |
117 | **Duration:** static (valid for 31 days)
118 |
119 |
120 | |Metric/link |Description |
121 | |--------------------|---------------------|
122 | |[Region 2 runtimes and trigger types](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/runtime_triggerType/df_funcID_runtime_triggerType.csv)|Runtime languages and trigger types per function in Region 2|
123 |
124 | ### Schema
125 |
126 | This table contains the cpu_request, runtime language, and triggerType-invocationType of each function in Region 2. A unique function is identified by funcID, which is a combination of funcName, userID, and poolName.
127 |
128 | **CPU request and CPU limit** In our system, the user can specify a CPU limit, e.g. 300 millicores. This is the maximum amount of CPU that pods for this function can use. However, the amount of CPU initially requested, which we call cpu\_request, tends to be lower. The exact value of cpu\_request may vary across resource configurations. We provide common values from one region; they can be used for other regions as well.
129 |
130 | | Name | Log | Description | Unit | Example |
131 | |------|-----|-------------|------|---------|
132 | | funcID | - | funcName---userID---poolName | - | 400---418---pool22-300-128 |
133 | | cpu_request | - | Amount of CPU requested | millicores | 100 |
134 | | runtime | - | Runtime language | - | Python3 |
135 | | triggerType-invocationType | - | List of trigger types of function, as well as the invocation type (synchronous/asynchronous). Refer to the paper for more details. | - | workflow-S |
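
A funcID can be split back into its components on the `---` separator. This is a small sketch based on the example in the table; the class and function names are hypothetical, and it assumes that none of the three parts contains `---` itself and that funcName and userID are integers, as in the schema tables.

```python
from typing import NamedTuple


class FuncID(NamedTuple):
    func_name: int
    user_id: int
    pool_name: str


def parse_func_id(func_id: str) -> FuncID:
    """Split 'funcName---userID---poolName' into typed fields."""
    func_name, user_id, pool_name = func_id.split("---")
    return FuncID(int(func_name), int(user_id), pool_name)


parsed = parse_func_id("400---418---pool22-300-128")
# parsed.func_name == 400, parsed.user_id == 418, parsed.pool_name == "pool22-300-128"
```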
136 |
137 |
138 |
139 | ## Huawei Public time series (2025)
140 |
141 | These files contain the time series of different metrics on our platform per function, including quantiles of totalCost (function execution time) or cold start time. We also include the number of requests, number of pods, and number of cold starts.
142 |
143 | **Duration:** 31 days
144 |
145 | |Metric/link |Description |
146 | |--------------------|---------------------|
147 | |[Region 1 quantiles](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/quantiles/R1.zip)|Region 1 quantiles|
148 | |[Region 2 quantiles](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/quantiles/R2.zip)|Region 2 quantiles|
149 | |[Region 3 quantiles](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/quantiles/R3.zip)|Region 3 quantiles|
150 | |[Region 4 quantiles](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/quantiles/R4.zip)|Region 4 quantiles|
151 | |[Region 5 quantiles](https://sir-dataset.obs.cn-east-3.myhuaweicloud.com/datasets/cold_start_dataset/quantiles/R5.zip)|Region 5 quantiles|
152 |
153 | ### Schema
154 |
155 | In addition to the event-based tables above, we provide files that aggregate different metrics per minute as a time series per funcID. The funcID uniquely identifies a function and is a combination of funcName, userID, and poolName, e.g. 400---418---pool22-300-128 (funcName=400, userID=418, poolName=pool22-300-128). Refer to the example notebooks for more information about how to use these files.
156 |
157 | We provide the following summary statistics per function per minute: avg, std, and the 0, 0.25, 0.50, 0.75, 0.90, 0.95, 0.99, and 1.00 quantiles.
158 |
159 | These statistics are computed for most of the metrics seen in the event-based tables, as shown in the table below.
160 |
161 | | Name | Log | Description |
162 | |-----------------------|------------|-----------------------------------------------------|
163 | | busCost | frontend | See request table schema. |
164 | | cpu_usage | worker | See request table schema. |
165 | | deployCodeCost | cold start | See cold start table schema. |
166 | | deployDependencyCost | cold start | See cold start table schema. |
167 | | frontendCost | frontend | See request table schema. |
168 | | memory_usage | worker | See request table schema. |
169 | | num_cold_starts | cold start | Number of cold starts per function. See cold start table schema. |
170 | | num_pods | worker | Number of running pods per function. See request table schema. |
171 | | podAllocationCost | cold start | See cold start table schema. |
172 | | readBodyCost | frontend | See request table schema. |
173 | | requestBodySize | frontend | See request table schema. |
174 | | requests | worker | Number of requests per function. See request table schema. |
175 | | runtimeCost | worker | See request table schema. |
176 | | schedulingCost | cold start | See cold start table schema. |
177 | | totalCost | worker | See totalCost_worker in request table schema. |
178 | | totalCost_cold_start | cold start | See cold start table schema. |
179 | | totalCost_frontend | frontend | See request table schema. |
180 | | workerCost | worker | See request table schema. |
181 | | writeRspCost | frontend | See request table schema. |
182 |
--------------------------------------------------------------------------------
/architecture.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sir-lab/data-release/8806bdea7e4fc6d148fef143a58b7abb34372f73/architecture.png
--------------------------------------------------------------------------------
/environment.yml:
--------------------------------------------------------------------------------
1 | name: trace-analysis
2 | channels:
3 | - conda-forge
4 | - defaults
5 | dependencies:
6 | - python=3.10
7 | - anaconda::numpy=1.26.0
8 | - anaconda::pandas=2.2.1
9 | - conda-forge::matplotlib=3.8.2
10 | - conda-forge::seaborn=0.13.2
11 | - conda-forge::dask=2024.3.1
12 | - conda-forge::tqdm=4.66.2
13 | - anaconda::ipykernel
14 | - conda-forge::jupyterlab=4.1.4
15 | - anaconda::notebook
16 |
--------------------------------------------------------------------------------
/papers/Serverless_Cold_Starts_and_Where_to_Find_Them_EuroSys_2025.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sir-lab/data-release/8806bdea7e4fc6d148fef143a58b7abb34372f73/papers/Serverless_Cold_Starts_and_Where_to_Find_Them_EuroSys_2025.pdf
--------------------------------------------------------------------------------
/papers/SoCC_2023_How_does_it_function.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sir-lab/data-release/8806bdea7e4fc6d148fef143a58b7abb34372f73/papers/SoCC_2023_How_does_it_function.pdf
--------------------------------------------------------------------------------