├── CITATION.cff
├── CODE_OF_CONDUCT.md
└── README.md


/CITATION.cff:
--------------------------------------------------------------------------------
 1 | cff-version: 1.2.0
 2 | title: Queueing theory
 3 | message: >-
 4 |   If you use this work and you want to cite it,
 5 |   then you can use the metadata from this file.
 6 | type: software
 7 | authors:
 8 |   - given-names: Joel Parker
 9 |     family-names: Henderson
10 |     email: joel@joelparkerhenderson.com
11 |     affiliation: joelparkerhenderson.com
12 |     orcid: 'https://orcid.org/0009-0000-4681-282X'
13 | identifiers:
14 |   - type: url
15 |     value: 'https://github.com/joelparkerhenderson/queueing-theory/'
16 |     description: Queueing theory
17 | repository-code: 'https://github.com/joelparkerhenderson/queueing-theory/'
18 | abstract: >-
19 |   Queueing theory
20 | license: See license file
21 | 


--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
  1 | 
  2 | # Contributor Covenant Code of Conduct
  3 | 
  4 | ## Our Pledge
  5 | 
  6 | We as members, contributors, and leaders pledge to make participation in our
  7 | community a harassment-free experience for everyone, regardless of age, body
  8 | size, visible or invisible disability, ethnicity, sex characteristics, gender
  9 | identity and expression, level of experience, education, socio-economic status,
 10 | nationality, personal appearance, race, caste, color, religion, or sexual
 11 | identity and orientation.
 12 | 
 13 | We pledge to act and interact in ways that contribute to an open, welcoming,
 14 | diverse, inclusive, and healthy community.
 15 | 
 16 | ## Our Standards
 17 | 
 18 | Examples of behavior that contributes to a positive environment for our
 19 | community include:
 20 | 
 21 | * Demonstrating empathy and kindness toward other people
 22 | * Being respectful of differing opinions, viewpoints, and experiences
 23 | * Giving and gracefully accepting constructive feedback
 24 | * Accepting responsibility and apologizing to those affected by our mistakes,
 25 |   and learning from the experience
 26 | * Focusing on what is best not just for us as individuals, but for the overall
 27 |   community
 28 | 
 29 | Examples of unacceptable behavior include:
 30 | 
 31 | * The use of sexualized language or imagery, and sexual attention or advances of
 32 |   any kind
 33 | * Trolling, insulting or derogatory comments, and personal or political attacks
 34 | * Public or private harassment
 35 | * Publishing others' private information, such as a physical or email address,
 36 |   without their explicit permission
 37 | * Other conduct which could reasonably be considered inappropriate in a
 38 |   professional setting
 39 | 
 40 | ## Enforcement Responsibilities
 41 | 
 42 | Community leaders are responsible for clarifying and enforcing our standards of
 43 | acceptable behavior and will take appropriate and fair corrective action in
 44 | response to any behavior that they deem inappropriate, threatening, offensive,
 45 | or harmful.
 46 | 
 47 | Community leaders have the right and responsibility to remove, edit, or reject
 48 | comments, commits, code, wiki edits, issues, and other contributions that are
 49 | not aligned to this Code of Conduct, and will communicate reasons for moderation
 50 | decisions when appropriate.
 51 | 
 52 | ## Scope
 53 | 
 54 | This Code of Conduct applies within all community spaces, and also applies when
 55 | an individual is officially representing the community in public spaces.
 56 | Examples of representing our community include using an official e-mail address,
 57 | posting via an official social media account, or acting as an appointed
 58 | representative at an online or offline event.
 59 | 
 60 | ## Enforcement
 61 | 
 62 | Instances of abusive, harassing, or otherwise unacceptable behavior may be
 63 | reported to the community leaders responsible for enforcement at
 64 | [INSERT CONTACT METHOD].
 65 | All complaints will be reviewed and investigated promptly and fairly.
 66 | 
 67 | All community leaders are obligated to respect the privacy and security of the
 68 | reporter of any incident.
 69 | 
 70 | ## Enforcement Guidelines
 71 | 
 72 | Community leaders will follow these Community Impact Guidelines in determining
 73 | the consequences for any action they deem in violation of this Code of Conduct:
 74 | 
 75 | ### 1. Correction
 76 | 
 77 | **Community Impact**: Use of inappropriate language or other behavior deemed
 78 | unprofessional or unwelcome in the community.
 79 | 
 80 | **Consequence**: A private, written warning from community leaders, providing
 81 | clarity around the nature of the violation and an explanation of why the
 82 | behavior was inappropriate. A public apology may be requested.
 83 | 
 84 | ### 2. Warning
 85 | 
 86 | **Community Impact**: A violation through a single incident or series of
 87 | actions.
 88 | 
 89 | **Consequence**: A warning with consequences for continued behavior. No
 90 | interaction with the people involved, including unsolicited interaction with
 91 | those enforcing the Code of Conduct, for a specified period of time. This
 92 | includes avoiding interactions in community spaces as well as external channels
 93 | like social media. Violating these terms may lead to a temporary or permanent
 94 | ban.
 95 | 
 96 | ### 3. Temporary Ban
 97 | 
 98 | **Community Impact**: A serious violation of community standards, including
 99 | sustained inappropriate behavior.
100 | 
101 | **Consequence**: A temporary ban from any sort of interaction or public
102 | communication with the community for a specified period of time. No public or
103 | private interaction with the people involved, including unsolicited interaction
104 | with those enforcing the Code of Conduct, is allowed during this period.
105 | Violating these terms may lead to a permanent ban.
106 | 
107 | ### 4. Permanent Ban
108 | 
109 | **Community Impact**: Demonstrating a pattern of violation of community
110 | standards, including sustained inappropriate behavior, harassment of an
111 | individual, or aggression toward or disparagement of classes of individuals.
112 | 
113 | **Consequence**: A permanent ban from any sort of public interaction within the
114 | community.
115 | 
116 | ## Attribution
117 | 
118 | This Code of Conduct is adapted from the [Contributor Covenant][homepage],
119 | version 2.1, available at
120 | [https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].
121 | 
122 | Community Impact Guidelines were inspired by
123 | [Mozilla's code of conduct enforcement ladder][Mozilla CoC].
124 | 
125 | For answers to common questions about this code of conduct, see the FAQ at
126 | [https://www.contributor-covenant.org/faq][FAQ]. Translations are available at
127 | [https://www.contributor-covenant.org/translations][translations].
128 | 
129 | [homepage]: https://www.contributor-covenant.org
130 | [v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
131 | [Mozilla CoC]: https://github.com/mozilla/diversity
132 | [FAQ]: https://www.contributor-covenant.org/faq
133 | [translations]: https://www.contributor-covenant.org/translations
134 | 
135 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | # Queueing theory
  2 | 
  3 | Queueing theory is the mathematical study of waiting lines, or queues. We use queueing theory in our software development to analyze and optimize our practices and processes, such as customer service responsiveness, project management kanban planning, inter-process communication message queues, and devops continuous deployment pipelines.
  4 | 
  5 | Contents:
  6 | 
  7 | * [Introduction](#introduction)
  8 |   * [Customer service responsiveness](#customer-service-responsiveness)
  9 |   * [Project management kanban planning](#project-management-kanban-planning)
 10 |   * [Inter-process communication message queues](#inter-process-communication-message-queues)
 11 |   * [Devops continuous deployment pipelines](#devops-continuous-deployment-pipelines)
 12 | * [Queue terminology](#queue-terminology)
 13 |   * [Queue types and service types](#queue-types-and-service-types)
 14 |   * [Queue dropouts](#queue-dropouts)
 15 | * [Queueing theory notation](#queueing-theory-notation)
 16 |   * [Arrival rate, service rate, dropout rate](#arrival-rate-service-rate-dropout-rate)
 17 |   * [Utilization ratio](#utilization-ratio)
 18 |   * [Error ratio](#error-ratio)
 19 |   * [Lead time, wait time, work time, step time](#lead-time-wait-time-work-time-step-time)
 20 |   * [Count](#count)
 21 |   * [Standard notation](#standard-notation)
 22 | * [Activity tracking](#activity-tracking)
 23 |   * [Activity examples](#activity-examples)
 24 |   * [Little's Law](#littles-law)
 25 |   * [Key performance indicators (KPIs)](#key-performance-indicators-kpis)
 26 | * [Epilog](#epilog)
 27 |   * [See also](#see-also)
 28 |   * [Thanks](#thanks)
 29 | 
 30 | 
 31 | ## Introduction
 32 | 
 33 | We use queueing theory in our projects for many purposes:
 34 | 
 35 |   * Customer service responsiveness
 36 | 
 37 |   * Project management kanban planning
 38 | 
 39 |   * Inter-process communication message queues
 40 | 
 41 |   * Devops continuous deployment pipelines
 42 | 
 43 | 
 44 | ### Customer service responsiveness
 45 | 
 46 | For example, we want to analyze how customers request sales help and support help, and how fast we respond.
 47 | 
 48 | Some relevant products are e.g. [Salesforce](https://wikipedia.org/wiki/Salesforce), [LiveChat](https://wikipedia.org/wiki/LiveChat), [Zendesk](https://wikipedia.org/wiki/Zendesk).
 49 | 
 50 | 
 51 | ### Project management kanban planning
 52 | 
 53 | For example, we want to track the lead times and progress times as a new feature idea evolves from design to delivery.
 54 | 
 55 | Some relevant products are e.g. [Asana](https://wikipedia.org/wiki/Asana_(software)), [Jira](https://wikipedia.org/wiki/Jira_(software)), [Microsoft Project](https://wikipedia.org/wiki/Microsoft_Project).
 56 | 
 57 | 
 58 | ### Inter-process communication message queues
 59 | 
 60 | For example, we want to maximize throughputs and minimize pressures as one program sends requests to another program.
 61 | 
 62 | Some relevant products are e.g. [RabbitMQ](https://wikipedia.org/wiki/RabbitMQ), [ActiveMQ](https://wikipedia.org/wiki/Apache_ActiveMQ), [ZeroMQ](https://wikipedia.org/wiki/ZeroMQ).
 63 | 
 64 | 
 65 | ### Devops continuous deployment pipelines
 66 | 
 67 | For example, we want to ensure our continuous integration server has capacity to test our software then deploy it.
 68 | 
 69 | Some relevant products are e.g. [Jenkins](https://wikipedia.org/wiki/Jenkins_(software)), [Bamboo](https://wikipedia.org/wiki/Bamboo_(software)), [Azure DevOps](https://wikipedia.org/wiki/Microsoft_Visual_Studio#Azure_DevOps).
 70 | 
 71 | 
 72 | ## Queue terminology
 73 | 
 74 | Queue terminology is a big topic. This section has some of our common terminology. For the examples, we will use the idea of a customer waiting in line.
 75 | 
 76 | 
 77 | ### Queue types and service types
 78 | 
 79 | Queue types and service types describe how the queue chooses which items to process.
 80 | 
 81 |   * First In First Out (FIFO): serve the customer who has been waiting for the longest time.
 82 | 
 83 |   * Last In First Out (LIFO): serve the customer who has been waiting for the shortest time.
 84 | 
 85 |   * Priority: serve customers based on their priority level; these levels could be based on status, urgency, payment, etc.
 86 | 
 87 |   * Shortest Job First (SJF): serve the customer who needs the smallest amount of service.
 88 | 
 89 |   * Longest Job First (LJF): serve the customer who needs the largest amount of service.
 90 | 
 91 |   * Time Sharing: serve everyone at the same time; service capacity is distributed evenly among everyone waiting.
 92 | 
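As a minimal sketch of how these service types differ, the following Python picks the next customer under each rule. The `Customer` fields, the sample data, and the convention that a smaller priority number means higher priority are our assumptions for illustration. Time Sharing is omitted because it serves everyone at once rather than picking one customer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    name: str
    arrival: int   # arrival time; a smaller value means a longer wait so far
    priority: int  # assumption: a smaller number means a higher priority
    job_size: int  # amount of service the customer needs


def next_customer(waiting, discipline):
    """Return the customer to serve next under the given service type."""
    pickers = {
        "FIFO": lambda: min(waiting, key=lambda c: c.arrival),
        "LIFO": lambda: max(waiting, key=lambda c: c.arrival),
        "Priority": lambda: min(waiting, key=lambda c: c.priority),
        "SJF": lambda: min(waiting, key=lambda c: c.job_size),
        "LJF": lambda: max(waiting, key=lambda c: c.job_size),
    }
    return pickers[discipline]()


# Hypothetical customers waiting in line.
waiting = [
    Customer("alice", arrival=1, priority=2, job_size=5),
    Customer("bob", arrival=2, priority=1, job_size=3),
    Customer("carol", arrival=3, priority=3, job_size=8),
]

for discipline in ("FIFO", "LIFO", "Priority", "SJF", "LJF"):
    print(discipline, "->", next_customer(waiting, discipline).name)
```

With this sample data, FIFO serves alice, LIFO serves carol, Priority and SJF serve bob, and LJF serves carol.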
 93 | 
 94 | ### Queue dropouts
 95 | 
 96 | Queue dropouts are when a customer does not make it through the queue.
 97 | 
 98 |   * Balking: when a customer decides not to start waiting for service because the wait time threatens to be too long.
 99 | 
100 |   * Reneging: when a customer who has waited already decides to leave because they’ve wasted too much time.
101 | 
102 |   * Jockeying: when a customer switches between queues in a parallel queue system, trying to get a shorter wait.
103 | 
104 | 
105 | ## Queueing theory notation
106 | 
107 | Queueing theory uses notation with Greek letters.
108 | 
109 | Our teams use some of the popular notation; we also add some custom notation that helps us with software projects.
110 | 
111 | 
112 | ### Arrival rate, service rate, dropout rate
113 | 
114 | The most important notation:
115 | 
116 |   * λ: arrival rate. This measures how fast new items are coming into the queue.
117 | 
118 |   * μ: service rate. This measures how fast items in the queue are being handled.
119 | 
120 |   * σ: dropout rate. This measures how fast items are leaving the queue unhandled.
121 | 
122 | Examples:
123 | 
124 |   * λ = μ means the arrival rate equals the service rate; the queue is staying the same size, other than dropouts.
125 | 
126 |   * λ > μ means the arrival rate is greater than the service rate; the queue is getting larger, other than dropouts.
127 | 
128 |   * λ < μ means the arrival rate is less than the service rate; the queue is getting smaller, other than dropouts.
129 | 
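The three rates above can be combined into a rough estimate of how the queue size changes. This is a simplified sketch that assumes constant rates over the period; the function names and sample numbers are hypothetical.

```python
def net_growth_rate(arrival_rate, service_rate, dropout_rate=0.0):
    """Return items gained per time unit; a negative value means the queue shrinks."""
    return arrival_rate - service_rate - dropout_rate


def project_queue_size(start_size, arrival_rate, service_rate, dropout_rate, time_units):
    """Estimate the queue size after some time units, assuming constant rates."""
    size = start_size + net_growth_rate(arrival_rate, service_rate, dropout_rate) * time_units
    return max(size, 0)  # a queue cannot hold fewer than zero items


# λ = 10, μ = 8, σ = 1 items per minute: the queue gains 1 item per minute.
print(net_growth_rate(10, 8, 1))
# Starting from 20 items, estimate the size after 5 minutes.
print(project_queue_size(20, 10, 8, 1, 5))
```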
130 | 
131 | ### Utilization ratio
132 | 
133 | The most important notation that summarizes a queue:
134 | 
135 |   * ρ: utilization ratio = λ / μ
136 | 
137 | Examples:
138 | 
139 |   * ρ = 1 means the arrival rate is equal to the service rate; the queue is staying the same size.
140 | 
141 |   * ρ > 1 means the arrival rate is greater than the service rate; the queue is getting larger.
142 | 
143 |   * ρ < 1 means the arrival rate is less than the service rate; the queue is getting smaller.
144 | 
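A small sketch of the utilization ratio in Python, assuming both rates use the same time unit. The helper names are ours, not standard.

```python
def utilization(arrival_rate, service_rate):
    """Return ρ = λ / μ."""
    if service_rate <= 0:
        raise ValueError("service rate must be positive")
    return arrival_rate / service_rate


def describe(rho):
    """Translate ρ into the queue trend described above."""
    if rho > 1:
        return "getting larger"
    if rho < 1:
        return "getting smaller"
    return "staying the same size"


# λ = 8 requests per second, μ = 10 requests per second.
rho = utilization(8, 10)
print(rho, describe(rho))
```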
145 | 
146 | ### Error ratio
147 | 
148 | The most important notation that summarizes a queue's success:
149 | 
150 |   * ε: error ratio = service failure count / service total count
151 | 
152 | Examples:
153 | 
154 |   * ε = 0 means no errors.
155 | 
156 |   * ε = 0.1 means 10% of services have an error.
157 | 
158 |   * ε = 1 means every service has an error.
159 | 
160 | 
161 | ### Lead time, wait time, work time, step time
162 | 
163 | We track four times:
164 | 
165 |   * τ: lead time = from arrival to finish
166 | 
167 |   * ω: wait time = from arrival to start of work
168 | 
169 |   * φ: work time = from start of work to finish
170 | 
171 |   * θ: step time = from finish to next finish
172 | 
173 | Examples:
174 | 
175 |   * τ = 5s means an item is added to the queue, then finished 5 seconds later.
176 | 
177 |   * ω = 4s means an item waits in the queue for 4 seconds, then work starts.
178 | 
179 |   * φ = 1s means an item takes 1 second of work, then is complete.
180 | 
181 |   * θ = 1s means there's 1 second between one completion and the next completion.
182 | 
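If we record three timestamps per item, the four times follow by subtraction. This is a minimal sketch; the `Item` record and its field names are assumptions for illustration, with timestamps in seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    arrived: int   # when the item entered the queue
    started: int   # when work on the item started
    finished: int  # when work on the item finished


def lead_time(item):
    """τ: from arrival to finish."""
    return item.finished - item.arrived


def wait_time(item):
    """ω: from arrival to start of work."""
    return item.started - item.arrived


def work_time(item):
    """φ: from start of work to finish."""
    return item.finished - item.started


def step_times(items):
    """θ: gaps between consecutive finishes, in finish order."""
    finishes = sorted(item.finished for item in items)
    return [later - earlier for earlier, later in zip(finishes, finishes[1:])]


items = [Item(arrived=0, started=4, finished=5), Item(arrived=1, started=5, finished=6)]
first = items[0]
print(lead_time(first), wait_time(first), work_time(first), step_times(items))
```

Note that lead time is always wait time plus work time, so tracking any two of the three gives the third.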
183 | 
184 | ### Count
185 | 
186 | We count items often, and we use this notation:
187 | 
188 |   * κ: count
189 | 
190 | Examples:
191 | 
192 |   * κ = 100 means there are 100 items.
193 | 
194 |   * κ > 100 means there are more than 100 items.
195 | 
196 |   * κ ≫ 100 means there are many more than 100 items.
197 | 
198 | 
199 | 
200 | ### Standard notation
201 | 
202 | Standard notation for queueing theory also uses these symbols:
203 | 
204 |   * n: the number of items in the system.
205 | 
206 |   * A: the arrival process probability distribution.
207 | 
208 |   * B: the service process probability distribution.
209 | 
210 |   * C: the number of servers.
211 | 
212 |   * D: the maximum number of items allowed in the queue at any given time, waiting or being served (without getting bumped).
213 | 
214 |   * E: the maximum number of items total.
215 | 
216 | 
217 | ## Activity tracking
218 | 
219 | 
220 | ### Activity examples
221 | 
222 | Suppose we have something we want to track, and we call it something generic such as "Activity", abbreviated as "A".
223 | 
224 | We can use queueing notation to describe the activity concisely, including how it moves through a queue.
225 | 
226 | Examples:
227 | 
228 |   * Aκ: Activity count: how many items are in the queue.
229 | 
230 |   * Aλ: Activity arrival rate: how many items are incoming per time unit.
231 | 
232 |   * Aμ: Activity service rate: how many items are completed per time unit.
233 | 
234 |   * Aσ: Activity dropout rate: how many items are abandoned per time unit.
235 | 
236 |   * Aρ: Activity utilization ratio: how many items are arriving vs. completing.
237 | 
238 |   * Aε: Activity error ratio: how many items are completed with errors vs. total.
239 | 
240 |   * Aτ: Activity lead time: how much time elapses from requested to completed.
241 | 
242 |   * Aω: Activity wait time: how much time elapses from requested to started.
243 | 
244 |   * Aφ: Activity work time: how much time elapses from started to completed.
245 | 
246 |   * Aθ: Activity step time: how much time elapses from completed to next completed.
247 | 
248 | 
249 | ### Little's Law
250 | 
251 | Little's law is a theorem by John Little which states: the long-term average number L of customers in a stationary system is equal to the long-term average effective arrival rate λ multiplied by the average time W that a customer spends in the system.
252 | 
253 | Example notation:
254 | 
255 |   * L is the long-term average number of customers in the system.
256 | 
257 |   * λ is the long-term average effective arrival rate.
258 | 
259 |   * W is the average time that a customer spends in the system.
260 | 
261 |   * L = λ W is Little's law.
262 | 
263 | Little's law assumptions:
264 | 
265 |   * All measurement units are consistent.
266 | 
267 |   * Conservation of flow, meaning the average arrival rate equals the average departure rate.
268 | 
269 |   * All work that enters the system then flows through to completion.
270 | 
271 |   * The system is “stable”, meaning the average age of items is neither increasing nor decreasing, and the total number of items is roughly the same at the beginning and at the end.
272 | 
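As a worked example of L = λ W, the sketch below computes the average number of items in a system, then rearranges the law to estimate the average time in the system from a known item count. The function names and numbers are hypothetical, and the results are only meaningful when the assumptions above hold.

```python
def average_items_in_system(arrival_rate, average_time_in_system):
    """Little's law: L = λ W."""
    return arrival_rate * average_time_in_system


def average_time_in_system(average_items, arrival_rate):
    """Little's law rearranged: W = L / λ."""
    if arrival_rate <= 0:
        raise ValueError("arrival rate must be positive")
    return average_items / arrival_rate


# 2 tickets arrive per hour and each spends 3 hours in the system on average.
print(average_items_in_system(2, 3))
# A kanban board holds 12 items on average and 4 items arrive per week.
print(average_time_in_system(12, 4))
```

The second call suggests an average of 3 weeks per item, which is one way kanban teams relate work in progress to lead time.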
273 | 
274 | ### Key performance indicators (KPIs)
275 | 
276 | We typically track many things about the activities in the queue, and we want to summarize the results by choosing a shortlist of the most relevant ones for our projects.
277 | 
278 | We have built many projects, and we believe the most valuable summary indicators are:
279 | 
280 |   * Dτ = Delivery lead time. Product teams may say "from concept to customer" or "from idea to implementation".
281 | 
282 |   * Dμ = Delivery service rate. Devops teams may say "deployment frequency" or "we ship X times per day".
283 | 
284 |   * Dε = Delivery error ratio. Quality teams may say "change fail rate" or "percentage of rollbacks".
285 | 
286 |   * Rτ = Restore lead time. Site reliability engineers may say "time to restore service" or "mean time to restore (MTTR)".
287 | 
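The four indicators can be computed from simple records. The following is a rough sketch; the record layout, the field names, the 20-day observation period, and the sample data are all assumptions for illustration, with times measured in days.

```python
from statistics import mean

# Hypothetical delivery records: request day, delivery day, and whether the change failed.
deliveries = [
    {"requested": 0, "delivered": 10, "failed": False},
    {"requested": 2, "delivered": 8, "failed": True},
    {"requested": 5, "delivered": 13, "failed": False},
    {"requested": 6, "delivered": 14, "failed": False},
]

# Hypothetical outages: (day service went down, day service was restored).
outages = [(8, 9), (15, 18)]

period_days = 20  # assumed length of the observation period


def summarize(deliveries, outages, period_days):
    """Return the four KPIs as a dictionary."""
    return {
        "delivery_lead_time": mean(d["delivered"] - d["requested"] for d in deliveries),
        "delivery_service_rate": len(deliveries) / period_days,
        "delivery_error_ratio": sum(d["failed"] for d in deliveries) / len(deliveries),
        "restore_lead_time": mean(up - down for down, up in outages),
    }


kpis = summarize(deliveries, outages, period_days)
for name, value in kpis.items():
    print(name, value)
```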
288 | 
289 | 
290 | ## Epilog
291 | 
292 | 
293 | ### See also
294 | 
295 | Wikipedia:
296 | 
297 |   * [Queueing theory](https://en.wikipedia.org/wiki/Queueing_theory)
298 | 
299 |   * [M/M/1 queue](https://en.wikipedia.org/wiki/M/M/1_queue)
300 | 
301 |   * [Little's law](https://en.wikipedia.org/wiki/Little%27s_law)
302 | 
303 |   * [Markov chain](https://en.wikipedia.org/wiki/Markov_chain)
304 | 
305 | Wikipedia areas where we use queues in many projects:
306 | 
307 |   * [Project management](https://en.wikipedia.org/wiki/Project_management)
308 | 
309 |   * [Message queue](https://en.wikipedia.org/wiki/Message_queue)
310 | 
311 |   * [DevOps](https://en.wikipedia.org/wiki/DevOps)
312 | 
313 | Introductions by John D. Cook:
314 | 
315 |   * [The science of waiting in line](https://www.johndcook.com/blog/2019/01/23/queueing/)
316 | 
317 |   * [Server utilization: Joel on queuing](https://www.johndcook.com/blog/2009/01/30/server-utilization-joel-on-queuing/)
318 | 
319 |   * [What happens when you add a new teller?](https://www.johndcook.com/blog/2008/10/21/what-happens-when-you-add-a-new-teller/)
320 | 
321 | Introductions with more detail:
322 | 
323 |   * [Queuing Theory: Simple Definition, Notation and Terminology](https://www.statisticshowto.datasciencecentral.com/queuing-theory/)
324 | 
325 |   * [The most important thing to understand about queues - By Dan Slimmon](https://blog.danslimmon.com/2016/08/26/the-most-important-thing-to-understand-about-queues/)
326 | 
327 |   * [Operations Research - Notes. By J E Beasley](http://people.brunel.ac.uk/~mastjjb/jeb/or/queue.html)
328 | 
329 |   * [Little’s Law – the basis of Lean and Kanban](http://itsadeliverything.com/littles-law-the-basis-of-lean-and-kanban)
330 | 
331 |   * [Investopedia: Queueing theory](https://www.investopedia.com/terms/q/queuing-theory.asp)
332 | 
333 | Blog posts:
334 | 
335 |   * [It's time for some queueing theory - By Kottke](https://kottke.org/19/01/its-time-for-some-queueing-theory)
336 | 
337 | 
338 | [Seven Insights Into Queueing Theory](http://www.treewhimsy.com/TECPB/Articles/SevenInsights.pdf):
339 | 
340 |   * The slower the service center, the lower the maximum utilization you should plan for at peak load. 
341 | 
342 |   * It’s very hard to use the last 15% of anything.
343 | 
344 |   * The closer you are to the edge, the higher the price for being wrong.
345 | 
346 |   * Response time increases are limited by the number that can wait.
347 | 
348 |   * Remember this is an average, not a maximum.
349 | 
350 |   * There is a human denial effect in multiple service centers.
351 | 
352 |   * Show small improvements in their best light.
353 | 
354 | 
355 | ### Thanks
356 | 
357 | [Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations. By Nicole Forsgren, Jez Humble, Gene Kim](https://www.amazon.com/dp/B07B9F83WM). This book is excellent for high level devops, and directly informs our choice of KPIs. The KPIs on this page align with the book's recommendations.
358 | 


--------------------------------------------------------------------------------