├── .gitignore
├── README.md
└── tasks.py


/.gitignore:
--------------------------------------------------------------------------------
1 | .idea
2 | 
3 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
   1 | # Notes on Software Systems Engineering
   2 | 
   3 | These notes have been collected over time during my work as a software engineer.
   4 | 
   5 | Most of these notes are my own, though occasionally I quote from great books and
   6 | other resources. Notes with quotations always include references to their
   7 | sources.
   8 | 
   9 | These notes should not be seen as strict instructions that must be followed, but
  10 | rather as soft guidelines or recommendations. They are most effective when
  11 | considered together. Taken in isolation, some may even contradict one another.
  12 | Trying to follow these notes too rigidly can diminish their value or even lead
  13 | to negative outcomes. There may also be some overlap between the notes, so it's
  14 | important not to take them too literally.
  15 | 
  16 | This is currently just a draft: it is far from complete and is organized
  17 | arbitrarily. Please don't expect it to be polished.
  18 | 
  19 | <!-- START doctoc generated TOC please keep comment here to allow auto update -->
  20 | <!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
  21 | 
  22 | - [Day-to-Day Work of a Software Engineer](#day-to-day-work-of-a-software-engineer)
  23 |   - [Leave Work Better: Improving Today for a Simpler Tomorrow](#leave-work-better-improving-today-for-a-simpler-tomorrow)
  24 |   - [Fast Feedback](#fast-feedback)
  25 |   - [Start Simple](#start-simple)
  26 |   - [Look Outside Your Immediate Task, Maintain the Bigger Picture](#look-outside-your-immediate-task-maintain-the-bigger-picture)
  27 |   - [Avoid Work That Can Be Avoided](#avoid-work-that-can-be-avoided)
  28 |   - [Understand and Respect the Customer](#understand-and-respect-the-customer)
  29 |   - [Choose Where to Innovate (Carefully)](#choose-where-to-innovate-carefully)
  30 |   - [Automate everything](#automate-everything)
  31 |   - [Quick exploration](#quick-exploration)
  32 |   - [Task Sequencing: Group Related Activities for Efficiency](#task-sequencing-group-related-activities-for-efficiency)
  33 |   - [Strive for Clarity](#strive-for-clarity)
  34 |   - [Everything Explicit. No Magic.](#everything-explicit-no-magic)
  35 |   - [Close the loops, acknowledge communication](#close-the-loops-acknowledge-communication)
  36 |   - [Learn from Lessons](#learn-from-lessons)
  37 |   - [Use Diagrams](#use-diagrams)
  38 | - [Communication and Teamwork](#communication-and-teamwork)
  39 |   - [Agile Software Development Requires Strong Social Network](#agile-software-development-requires-strong-social-network)
  40 |   - [Sending Status Updates to the Team](#sending-status-updates-to-the-team)
  41 |   - [Keep Everyone in the Loop](#keep-everyone-in-the-loop)
  42 |   - [Recognize the ideas and achievements of your colleagues](#recognize-the-ideas-and-achievements-of-your-colleagues)
  43 |   - [Professional content](#professional-content)
  44 |   - [Loop in Experts for Important Actions](#loop-in-experts-for-important-actions)
  45 | - [Complexity and Cognitive Load](#complexity-and-cognitive-load)
  46 |   - [Solving Right Problems](#solving-right-problems)
  47 |   - [Solutions are Context-Driven](#solutions-are-context-driven)
  48 |   - [Weakest link](#weakest-link)
  49 |   - [Point of View](#point-of-view)
  50 |   - [Periphery](#periphery)
  51 |   - [Rational and Unconscious](#rational-and-unconscious)
  52 |   - [Humans are not designed for Big Numbers](#humans-are-not-designed-for-big-numbers)
  53 |   - [There is no such thing as Many](#there-is-no-such-thing-as-many)
  54 |   - [0-1-2-Many I](#0-1-2-many-i)
  55 |   - [0-1-2-Many II](#0-1-2-many-ii)
  56 |   - [Masking (Shadowing)](#masking-shadowing)
  57 | - [Design](#design)
  58 |   - [Poor Abstraction](#poor-abstraction)
  59 |   - [Cost of Abstraction](#cost-of-abstraction)
  60 |   - [Habitability](#habitability)
  61 |   - [Hard Feature](#hard-feature)
  62 |   - [True Name](#true-name)
  63 |   - [One Pattern per Class](#one-pattern-per-class)
  64 |   - [Archetype](#archetype)
  65 |   - [Prima Materia](#prima-materia)
  66 |   - [Mature automation](#mature-automation)
  67 |   - ["Magic" is automation that is not adequate](#magic-is-automation-that-is-not-adequate)
  68 |   - [Poisonous Systems](#poisonous-systems)
  69 |   - [Bad Design in House](#bad-design-in-house)
  70 |   - [Trade-off of Encapsulation](#trade-off-of-encapsulation)
  71 |   - [Unnecessary Flexibility](#unnecessary-flexibility)
  72 |   - [Black Box with a Green Play Button](#black-box-with-a-green-play-button)
  73 |   - [Single Source Concept and Its Exceptions](#single-source-concept-and-its-exceptions)
  74 |   - [Resilience to Change vs Fixed Perfect Solutions](#resilience-to-change-vs-fixed-perfect-solutions)
  75 |   - [Two Almost Identical Entities](#two-almost-identical-entities)
  76 |   - [Control](#control)
  77 |     - [Observable Control](#observable-control)
  78 |     - [Humans should dominate machines](#humans-should-dominate-machines)
  79 |     - [Overlapping control](#overlapping-control)
  80 |     - [Broken control loops](#broken-control-loops)
  81 |   - [Feedback](#feedback)
  82 |     - [Broken feedback loops](#broken-feedback-loops)
  83 |   - [Separation / partitioning](#separation--partitioning)
  84 |   - [Grouping](#grouping)
  85 |   - [Observability vs Correctness](#observability-vs-correctness)
  86 |   - [Don't Use RAII on a Business Logic Level](#dont-use-raii-on-a-business-logic-level)
  87 | - [Coding, code reviews, and maintenance programming](#coding-code-reviews-and-maintenance-programming)
  88 |   - [Code that Works](#code-that-works)
  89 |   - [Code Is Not Your Partner](#code-is-not-your-partner)
  90 |   - [Two Strategies for Replacing a Feature](#two-strategies-for-replacing-a-feature)
  91 |   - [Smallest Scope](#smallest-scope)
  92 |   - [Code Style as a Blocker](#code-style-as-a-blocker)
  93 |   - [Simplifying Complex Feature Branches](#simplifying-complex-feature-branches)
  94 |   - [The Moving and Changing Anti-pattern](#the-moving-and-changing-anti-pattern)
  95 |   - [Avoid Plural Names For Classes](#avoid-plural-names-for-classes)
  96 |   - [Fast Programming and Slow Programming](#fast-programming-and-slow-programming)
  97 |   - [Stable Components](#stable-components)
  98 |   - [Boring Code](#boring-code)
  99 |   - [Boring Code 2](#boring-code-2)
 100 |   - [Lack of Knowledge](#lack-of-knowledge)
 101 |   - [Lack of Knowledge II](#lack-of-knowledge-ii)
 102 |   - [Goodwill vs Pain](#goodwill-vs-pain)
 103 | - [Biases](#biases)
 104 |   - [If It Works, Then It Works Bias](#if-it-works-then-it-works-bias)
 105 |   - [Focusing only on what's most visible bias](#focusing-only-on-whats-most-visible-bias)
 106 |   - [The Fix Bias](#the-fix-bias)
 107 |   - [Resolving Merge Conflict Bias](#resolving-merge-conflict-bias)
 108 | - [Reliability](#reliability)
 109 |   - [Errors are not ok](#errors-are-not-ok)
 110 |   - [Errors must be understood and described](#errors-must-be-understood-and-described)
 111 |   - [Underlying errors shall not be hidden](#underlying-errors-shall-not-be-hidden)
 112 |   - [Critical errors vs non-critical errors](#critical-errors-vs-non-critical-errors)
 113 |   - [Assertions are better than no error handling](#assertions-are-better-than-no-error-handling)
 114 |   - [Assertions are shortcuts for a proper error handling](#assertions-are-shortcuts-for-a-proper-error-handling)
 115 |   - [Crash Early](#crash-early)
 116 | - [Testing](#testing)
 117 |   - [Write Tests, Even Bad Ones](#write-tests-even-bad-ones)
 118 |   - [TDD as a Toolbox](#tdd-as-a-toolbox)
 119 |   - [Legacy Code is Code Without Tests](#legacy-code-is-code-without-tests)
 120 |   - [Testing as a Way to Manage Complexity](#testing-as-a-way-to-manage-complexity)
 121 |   - [Test It to Engineer It](#test-it-to-engineer-it)
 122 |   - [Improve Testability](#improve-testability)
 123 | - [Distribution](#distribution)
 124 |   - [Provide Basic Test Sequences with Your Product](#provide-basic-test-sequences-with-your-product)
 125 |   - [Provide Drivers Alongside Your Hardware](#provide-drivers-alongside-your-hardware)
 126 |   - [Provide Simulators Alongside Your Hardware](#provide-simulators-alongside-your-hardware)
 127 | - [Documentation](#documentation)
 128 |   - [The Illusion of Easy Documentation](#the-illusion-of-easy-documentation)
 129 |   - [Less prose, more structure](#less-prose-more-structure)
 130 |   - [Too Much Structure Overload](#too-much-structure-overload)
 131 |   - [Encyclopedic Document](#encyclopedic-document)
 132 | - [Meetings](#meetings)
 133 |   - [Sound Check](#sound-check)
 134 |   - [Meeting Agenda](#meeting-agenda)
 135 |   - [Meeting Notes](#meeting-notes)
 136 |   - [Capturing Meeting Results](#capturing-meeting-results)
 137 |   - [Briefing In](#briefing-in)
 138 |   - [Sharing Screen & Presenting Material](#sharing-screen--presenting-material)
 139 | - [Systems](#systems)
 140 |   - [Good enough is often best](#good-enough-is-often-best)
 141 |   - [Designing Systems for Effective Work](#designing-systems-for-effective-work)
 142 |   - [The Risk of Default Outcomes](#the-risk-of-default-outcomes)
 143 | - [People and Organizations](#people-and-organizations)
 144 |   - [Everyone is busy](#everyone-is-busy)
 145 |   - [Solving Problems with Cash](#solving-problems-with-cash)
 146 |   - [The Paradox of Rushing in Software/Systems Engineering](#the-paradox-of-rushing-in-softwaresystems-engineering)
 147 |   - [Four seasons](#four-seasons)
 148 | - [Standards](#standards)
 149 |   - [Idealized standards vs. practical implementation](#idealized-standards-vs-practical-implementation)
 150 |   - [The challenge of standards implementation](#the-challenge-of-standards-implementation)
 151 |   - [Standards and best practices](#standards-and-best-practices)
 152 |   - [Standards favor good practice](#standards-favor-good-practice)
 153 |   - [Wrong is worse than early or incomplete](#wrong-is-worse-than-early-or-incomplete)
 154 | - [Requirements](#requirements)
 155 |   - [One-stop shopping](#one-stop-shopping)
 156 | - [Safety](#safety)
 157 |   - [Safety does not exist without blood, loss or failure](#safety-does-not-exist-without-blood-loss-or-failure)
 158 |   - [Safety is boring](#safety-is-boring)
 159 |   - [Safety is very hard to achieve but is very easy to lose](#safety-is-very-hard-to-achieve-but-is-very-easy-to-lose)
 160 |   - [Success breeds failure](#success-breeds-failure)
 161 |   - [Safety as a Defensive Discipline](#safety-as-a-defensive-discipline)
 162 |   - [Safety for Engineering is Like Medicine for People](#safety-for-engineering-is-like-medicine-for-people)
 163 |   - [User Interfaces and Critical Systems](#user-interfaces-and-critical-systems)
 164 | - [Books](#books)
 165 | - [Similar resources](#similar-resources)
 166 | - [Copyright](#copyright)
 167 | 
 168 | <!-- END doctoc generated TOC please keep comment here to allow auto update -->
 169 | 
 170 | ## Day-to-Day Work of a Software Engineer
 171 | 
 172 | ### Leave Work Better: Improving Today for a Simpler Tomorrow
 173 | 
 174 | Always leave the work artifacts – whether code, documentation, diagrams, models,
 175 | or others – in a better state than they were before, giving future you or
 176 | someone else the opportunity to improve them even further.
 177 | 
 178 | ### Fast Feedback
 179 | 
 180 | Fast feedback is essential for making progress and avoiding wasted effort. It
 181 | helps engineers quickly test ideas, catch mistakes early, and stay in the flow.
 182 | Useful ways to get fast feedback include test-driven development, fast-running
 183 | test suites, effective debugging tools, and simply asking a colleague for quick
 184 | advice. When starting on a new project, one of the first things to learn is how
 185 | to run the existing tests, write new ones, and figure out the quickest way to
 186 | debug. Investing in faster tools, clearer error messages, and smoother processes
 187 | pays off – the shorter the feedback loop, the more confidently and efficiently
 188 | you can work.
 189 | 
 190 | ### Start Simple
 191 | 
 192 | Start with something simple, then extend it further. Most often a complex
 193 | problem is a composition of simpler problems. If you are facing a problem and
 194 | are intimidated by its complexity, try to make the smallest possible step
 195 | towards the solution and see what you can do from there. Simple can also mean
 196 | quick and dirty, but that's OK because it is only a start. Once you have
 197 | something simple working, you have ground to build on. Most likely this means
 198 | you have an **archetype** of the future real and complex system.
 199 | 
 200 | See also Kent Beck's
 201 | [Test-Driven Development book](https://en.wikipedia.org/wiki/Test-Driven_Development_by_Example)
 202 | where this approach of doing simple things is explained in great depth.
 203 | 
 204 | ### Look Outside Your Immediate Task, Maintain the Bigger Picture
 205 | 
 206 | - When starting any task, take time to understand the rationale behind it (the
 207 |   WHY).
 208 | - See how the task connects to broader goals, milestones, or parallel efforts.
 209 |   It may be part of a chain where upstream or downstream effects matter.
 210 | - Maintain awareness of the bigger picture. A task that seems minor may be a
 211 |   critical blocker for a more visible effort. Conversely, something that appears
 212 |   simple might turn out to be time-consuming and affect teammates or
 213 |   dependencies.
 214 | - With a deep understanding of the task, you will start seeing how different
 215 |   strategies (e.g., Strategy X vs. Strategy Y) can lead to different outcomes.
 216 |   This kind of insight allows you to:
 217 | 
 218 |   - Escalate risks early.
 219 |   - Spot opportunities.
 220 |   - Align better with the system's or team's needs.
 221 | 
 222 | A practical application of this mindset in documentation writing: start each
 223 | technical page with a clear problem statement and a description of its
 224 | surrounding context.
 225 | 
 226 | - Who or what benefits if this task is completed?
 227 | - Does it enable a system, a process, or a team?
 228 | - What is the strategic value of solving it?
 229 | 
 230 | Framing the problem this way helps readers, especially future engineers, orient
 231 | themselves and understand the significance of the solution that follows.
 232 | 
 233 | ### Avoid Work That Can Be Avoided
 234 | 
 235 | Before starting or planning any work, always ask: Is this work truly necessary?
 236 | 
 237 | Sometimes, tasks are initiated based on uninformed decisions, leading to work
 238 | that ultimately provides little value or fails to achieve the desired outcomes.
 239 | 
 240 | "Busy work" refers to inefficient tasks that consume time and resources while
 241 | contributing little to the project's success. It can compromise schedules,
 242 | reduce technical consistency, lower team morale, and create the illusion of
 243 | progress. The ability to recognize and eliminate busy work is one of the skills
 244 | that distinguishes a senior engineer from a junior one.
 245 | 
 246 | Engineering is sometimes cheating. Instead of implementing something
 247 | sophisticated, a smarter workaround can achieve the same result with far less
 248 | effort. For example, rather than building a solution from scratch, reuse
 249 | existing work – whether by leveraging open-source software or buying an
 250 | off-the-shelf system.
 251 | 
 252 | In software development, there's a well-known saying: "The best code is the code
 253 | that is never written".
 254 | 
 255 | ### Understand and Respect the Customer
 256 | 
 257 | Take time to deeply understand and respect the customer, both the people and the
 258 | domain they operate in. Immerse yourself in their context. Know what they care
 259 | about, what problems they face, and how your work fits into their world.
 260 | 
 261 | When things go smoothly, this understanding helps you deliver real value. When
 262 | things get challenging, such as when delays or technical setbacks arise, this
 263 | relationship matters even more.
 264 | 
 265 | In such situations, transparency is better than defensiveness. A clear and
 266 | honest update, even when delivering bad news, builds trust. Customers almost
 267 | always prefer being informed early over being surprised later. A transparent
 268 | explanation of issues, trade-offs, and risks shows respect for their time,
 269 | planning, and decision-making.
 270 | 
 271 | ### Choose Where to Innovate (Carefully)
 272 | 
 273 | Innovate where your business's focus lies and stay conservative with other areas
 274 | by using established technologies.
 275 | 
 276 | For example, a company focused on rocket software should likely avoid building
 277 | its own web framework or NoSQL database. Exceptions exist, but they are rare,
 278 | such as when a company diversifies into a highly successful product unrelated
 279 | to its core business. Innovating in too many areas can compromise the core
 280 | product and cause missed deadlines.
 281 | 
 282 | For a great explanation, refer to this
 283 | [Boring Technology presentation](https://boringtechnology.club/).
 284 | 
 285 | ### Automate everything
 286 | 
 287 | Seek opportunities to automate processes or tasks. Automation eliminates busy
 288 | work, freeing time for more valuable activities. It reduces human error,
 289 | increases efficiency, and helps to maintain consistency. The best workflows are
 290 | automated ones.
 291 | 
 292 | ### Quick exploration
 293 | 
 294 | The solution you're looking for might be just two clicks and a couple of Google
 295 | searches away.
 296 | 
 297 | When reading large documents, it can be helpful to "fly over" them to quickly
 298 | locate the most relevant section rather than reading from A to Z.
 299 | 
 300 | When exploring with code, a combination of quick-and-dirty scripts can sometimes
 301 | work wonders, giving immediate and valuable insights. Instead of discarding
 302 | an idea because it's complex and time-consuming, try implementing a very basic
 303 | version first because it might provide useful insights or even a functional
 304 | solution right away.
 305 | 
 306 | ### Task Sequencing: Group Related Activities for Efficiency
 307 | 
 308 | When sequencing tasks (especially repetitive ones), group related tasks together
 309 | and separate them from others.
 310 | 
 311 | One useful pattern is the 'Inbox' approach, where input is first collected and
 312 | then executed upon. For example, when writing a technical document, split the
 313 | task of gathering the document content (the 'Inbox' with bullet points) from the
 314 | task of formulating and spelling out each individual content item.
 315 | 
 316 | ### Strive for Clarity
 317 | 
 318 | Strive for clarity in everything you do. Put in the effort to make the products
 319 | of your work, or the aspects of the system you're working on, as clear as
 320 | possible. Simplify complexity – either by reducing the complexity itself through
 321 | development or, if that's not feasible, by explaining the details as clearly as
 322 | possible.
 323 | 
 324 | Avoid owning too many non-obvious details about your work that only you
 325 | understand. Do not hold onto esoteric knowledge – de-esoterize it. Document it
 326 | for everyone to access.
 327 | 
 328 | Encyclopedism or esotericism is an anti-pattern because it obscures common
 329 | knowledge about the system for others.
 330 | 
 331 | - Document everything, especially the most complex topics.
 332 | - Use plain English and diagrams to explain complex topics to your colleagues.
 333 |   Test your content with them to ensure it is accessible. If it's still unclear,
 334 |   ask for their feedback to improve it.
 335 | 
 336 | ### Everything Explicit. No Magic.
 337 | 
 338 | Whenever you face a choice between explicit and magic, always choose explicit.
 339 | 
 340 | "Magic" is a term software engineers use for anything that is non-obvious,
 341 | hidden, overly complex, or no longer suited to the system's current state.
 342 | 
 343 | Making things explicit requires a constant effort to ensure clarity, so that
 344 | others can understand your work without extra effort. A good test for
 345 | explicitness is whether understanding is immediate, with no mental effort or
 346 | blockers when going through the material.
 347 | 
 348 | ### Close the loops, acknowledge communication
 349 | 
 350 | A "loop" refers to any situation where one action is followed by another that
 351 | resolves the first action in some way. Often, these loops are explicitly called
 352 | "feedback loops" because they are closed with feedback that resolves an
 353 | outstanding action or state, such as marking it Done, OK, ACK, or something
 354 | similar.
 355 | 
 356 | Loops can exist both in the systems being developed and in the organizations that develop them.
 357 | 
 358 | Examples of loops:
 359 | 
 360 | - Answering an email from an existing email thread closes the loop created by
 361 |   that thread.
 362 | - Closing a Pull Request finalizes its status, either as Done or Won't do.
 363 | - Closing a work item ticket to Done.
 364 | 
 365 | A task manager is an excellent tool for tracking work items that need to be
 366 | completed and closed. For tracking non-trivial project development topics and
 367 | trade-offs, a useful practice is to maintain an "Open Questions Log" – a table
 368 | where each unresolved or unclosed item is tracked by its current status until it
 369 | is resolved.
 370 | 
 371 | Sometimes a loop may never be closed, or it may be closed with a significant
 372 | delay. Both scenarios can lead to potential problems or even hazards, depending
 373 | on the type of system being developed.
 374 | 
 375 | Note that 'Won't-do' is also a valid way to close the loop. For example, closing
 376 | a Jira ticket with "Won't do" or "Won't fix" positively acknowledges that this
 377 | work will no longer linger in someone's backlog.
 378 | 
 379 | Not closing loops is often bad practice. Some examples include:
 380 | 
 381 | - Not answering an email can cause project delays or result in the
 382 |   implementation of a broken or inconsistent system, leading to incidents or
 383 |   accidents in the future.
 384 | - A missed or forgotten chat message may mean important information is never
 385 |   delivered to a critical person.
 386 | - A manager who neglects to follow up on an important topic raised by employees
 387 |   leaves it unresolved in an inbox without due attention.
 388 | 
 389 | ### Learn from Lessons
 390 | 
 391 | Do something, then learn from experience. Don't forget – take deliberate time to
 392 | reflect. The industry has developed several best practices for capturing lessons
 393 | learned:
 394 | 
 395 | - Standards: Organizational and industry knowledge is captured in standards,
 396 |   handbooks, guidelines, and best practices.
 397 | - Post-mortems: When something goes wrong, those involved produce a structured
 398 |   report about the event. Larger companies maintain databases of critical
 399 |   incidents that employees can study to educate themselves.
 400 | - Debriefs: After a meeting, the group discusses what went well or wrong.
 401 | - Lessons learned documentation and meetings: After completing an important
 402 |   activity, such as a project or milestone, the team takes time to reflect on
 403 |   what went well or wrong, learn from it, and document the findings.
 404 | 
 405 | Learning doesn't have to be only organizational – it can also be personal.
 406 | 
 407 | Examples:
 408 | 
 409 | - If a project was successful, what made it so? If a project failed, what were
 410 |   the key contributing factors? How can it be improved next time?
 411 | - Learning how to estimate software work better – what if a task was estimated
 412 |   to take X weeks but actually took 3X? Wouldn't it be valuable to improve
 413 |   estimation skills?
 414 | - If one colleague is significantly more effective than another, what makes them
 415 |   so? What tools, techniques, or habits contribute to their efficiency? Can
 416 |   something be learned from them?
 417 | - Observing bugs missed during code reviews – what types of bugs tend to escape
 418 |   static analysis or peer review? What patterns can be identified to prevent
 419 |   them in the future?
 420 | 
 421 | ### Use Diagrams
 422 | 
 423 | Use diagrams as part of your daily work. A diagram can often explain far more
 424 | than several paragraphs of text.
 425 | 
 426 | Use diagrams for:
 427 | 
 428 | - Prototyping and documenting software
 429 | - Pair programming
 430 | - Hardware-software integration testing
 431 | - Meetings (including external meetings)
 432 | - Onboarding colleagues
 433 | - Everything else where a good visualization helps
 434 | 
 435 | There are standards and conventions for creating diagrams, such as UML, but in
 436 | practice, even very basic diagrams can be incredibly useful. Use simple shapes
 437 | like rectangles and arrows, avoid excessive colors or different shapes, and
 438 | express your concepts with the fewest visual elements possible. Creating
 439 | diagrams that are too visually complex hinders understanding and reduces their
 440 | effectiveness.
 441 | 
 442 | ## Communication and Teamwork
 443 | 
 444 | ### Agile Software Development Requires Strong Social Network
 445 | 
 446 | **Agile Software Development Requires Strong Social Network**. This statement is
 447 | a generalization. The idea has been present since the inception of the
 448 | [Agile Manifesto](https://agilemanifesto.org/), but the
 449 | following quote from Kent Beck pinpoints it very clearly:
 450 | 
 451 | > In The Forest (more specifically on an XP-style team), we handle communication
 452 | > of design & implementation multiple ways:
 453 | >
 454 | > - Communicative code.
 455 | > - Readable & predictive tests.
 456 | > - A strong social network.
 457 | >
 458 | > It's only when there is a large audience for stable information (such as the
 459 | > JUnit API) that we resort to separate documentation.
 460 | 
 461 | See
 462 | [Kent Beck - Anatomy of Oscillation](https://tidyfirst.substack.com/p/anatomy-of-oscillation).
 463 | 
 464 | ### Sending Status Updates to the Team
 465 | 
 466 | Software engineering teams often communicate daily via chat. A proven pattern is
 467 | for each team member to send updates about their work, allowing the entire team
 468 | to see these messages.
 469 | 
 470 | Examples of such messages include:
 471 | 
 472 | - "Task X is done, here's the PR link. @A and @B, could you take a look?"
 473 | - "This week my focus is... Next, I am going to work on..."
 474 | - "I see your PR, but I'm working on something else."
 475 | - "What does the team think about introducing the coding convention ABC?"
 476 | 
 477 | While this may seem obvious for some teams, there are others where daily chats
 478 | are completely silent, reflecting a lack of communication between peers
 479 | throughout the day. When messages are exchanged, it creates a certain "pulse"
 480 | within the team, signaling that the group is actively working on meaningful
 481 | tasks and is open to discussion, iteration, and improvement.
 482 | 
 483 | This activity not only serves an informational purpose (increasing awareness)
 484 | but also has learning, motivational, and even entertaining aspects.
 485 | 
 486 | ### Keep Everyone in the Loop
 487 | 
 488 | Share regular updates with the people who rely on your work: your manager,
 489 | teammates, or anyone following your technical progress. In fast-moving projects,
 490 | keeping others informed helps avoid surprises and keeps everyone aligned.
 491 | 
 492 | In an office setting, updates often happen naturally. If the team is
 493 | well-connected, these updates may happen through casual conversations or small
 494 | talk over lunch. This kind of informal communication spreads useful information
 495 | without needing formal meetings.
 496 | 
 497 | One big advantage: by the time your work reaches a review—like a code review,
 498 | documentation review, or a project milestone—people will already know about it
 499 | and may have given input earlier. This makes reviews faster, smoother, and less
 500 | stressful.
 501 | 
 502 | Another reason to talk about your work: visibility and recognition. Others might
 503 | not know:
 504 | 
 505 | - what challenges you're facing
 506 | - how long something might take
 507 | - how your work connects to theirs.
 508 | 
 509 | Your teammates are often busy with their own tasks. Clear communication helps
 510 | them understand what you are doing and helps your work get noticed and
 511 | appreciated.
 512 | 
 513 | Stay connected. Stay aligned.
 514 | 
 515 | ### Recognize the ideas and achievements of your colleagues
 516 | 
 517 | Teamwork involves contributions from all team members. Whether you are a leader
 518 | or an individual contributor, it is essential to give credit where it's due when
 519 | expressing an idea that you know was authored by someone else.
 520 | 
 521 | This is a good practice because it fosters trust and respect within the team,
 522 | encouraging open collaboration and the free exchange of ideas. Recognizing
 523 | others' contributions also boosts morale, motivates continued input, and
 524 | strengthens the overall effectiveness of the team.
 525 | 
 526 | An anti-pattern is when the names of the original authors are omitted, and the
 527 | work is presented in the first person, either intentionally or unintentionally,
 528 | as if the content were one's own.
 529 | 
 530 | ### Professional content
 531 | 
 532 | When writing an email or chat message, even if addressed to a select group,
 533 | consider composing it in a way that it would remain professional and consistent
 534 | if shared with a larger or unintended audience. Avoid using vague references
 535 | like "we" and "they", especially when referring to internal teams or external
 536 | parties such as customers. Refrain from using negative sentences or excessive
 537 | emotion. Your content should be polished and ready to be forwarded by anyone, at
 538 | any time, whether intentionally or unintentionally.
 539 | 
 540 | ### Loop in Experts for Important Actions
 541 | 
 542 | When making an important decision, involve the right experts. It is better to
 543 | include too many people than to miss someone who should have been part of it.
 544 | 
 545 | If you are writing an email or message that speaks for your team or group, check
 546 | it with others first. Make sure the message reflects what everyone agrees on.
 547 | 
 548 | When a message is aligned like this, it:
 549 | 
 550 | - Stays strong even if people question it.
 551 | - Builds trust inside and outside the team.
 552 | - Shows that the team is working together.
 553 | 
 554 | Taking the time to check with others makes your message clearer and more
 555 | powerful in the long run.
 556 | 
 557 | ## Complexity and Cognitive Load
 558 | 
 559 | > "Complexity can be defined as intellectual unmanageability" (Nancy Leveson,
 560 | > Engineering a Safer World, p.4)
 561 | 
 562 | See also [Cognitive load](https://en.wikipedia.org/wiki/Cognitive_load) (and cognitive overload).
 563 | 
 564 | ### Solving Right Problems
 565 | 
 566 | "Engineers are great at solving problems but they are not always great at
 567 | identifying the right problems to be solved" (Dr. John Thomas, ESWC 2019).
 568 | 
 569 | ### Solutions are Context-Driven
 570 | 
 571 | Even the best solution to a problem is valid only within a given context. A
 572 | slight change in the context can invalidate the solution, requiring one to start
 573 | from scratch. This understanding highlights that no solution is universally
 574 | perfect. Instead, solutions address specific problems or contexts in an "optimal
 575 | enough" way. It also encourages detachment from ego-driven perfection, allowing
 576 | solutions to evolve as the environment changes.
 577 | 
 578 | Examples:
 579 | 
 580 | - A clean architecture or pattern may shift to a completely different, sometimes
 581 |   opposite, solution due to changing requirements or system environments.
 582 | - A "perfect" solution might be discarded because a new team or team leader
 583 |   dislikes technology X and prefers technology Y, or simply because it aligns
 584 |   with emerging industry trends.
 585 | - Perfectly clean code may be rewritten and become more obfuscated due to
 586 |   necessary performance optimizations.
 587 | - Highly efficient code might be rewritten to sacrifice performance in favor of
 588 |   better maintainability and readability, especially for a larger team.
 589 | 
 590 | ### Weakest link
 591 | 
 592 | A piece of information is only as clear as its most ambiguous part. This is a
 593 | generalisation of the following fragment from "Patterns for Writing Effective
 594 | Use Cases" by Steve Adolph et al., Chapter 6.6:
 595 | 
 596 | > Like the old proverb, "A chain is only as strong as its weakest link", a use
 597 | > case is only as clear as its most ambiguous step.
 598 | 
 599 | ### Point of View
 600 | 
 601 | [How NASA Builds Teams](https://www.wiley.com/en-us/How+NASA+Builds+Teams%3A+Mission+Critical+Soft+Skills+for+Scientists%2C+Engineers%2C+and+Project+Teams-p-9780470456484):
 602 | 
 603 | > The right coordinate system can turn an impossible problem into two really
 604 | > hard ones.
 605 | 
 606 | [The Early History Of Smalltalk](https://worrydream.com/EarlyHistoryOfSmalltalk/)
 607 | 
 608 | > Watching a famous guy much smarter than I struggle for more than 30 minutes to
 609 | > not quite solve the problem his way (there was a bug) made quite an
 610 | > impression. It brought home to me once again that "point of view is worth 80
 611 | > IQ points." I wasn't smarter but I had a much better internal thinking tool to
 612 | > amplify my abilities. This incident and others like it made paramount that any
 613 | > tool for children should have great thinking patterns and deep beauty
 614 | > "built-in."
 615 | 
 616 | ### Periphery
 617 | 
 618 | If your reasoning is hindered by cognitive overload while trying to solve a
 619 | problem, and there's no clear first step toward a solution, take a step back and
 620 | start working with the Periphery. By cleaning up the periphery, you'll often
 621 | find that the core problem becomes clearer and more approachable.
 622 | 
 623 | A good example is legacy code: issues in the periphery, such as poor variable
 624 | names, incorrect class responsibilities (even those distant from your immediate
 625 | problem), or a disorganized folder structure, may seem irrelevant to the core
 626 | issue. However, they still contribute to the cognitive overload. Fixing them
 627 | will help clear the path for your actual work.
 628 | 
 629 | Another word for Periphery is Background, see also
 630 | [Deconcentration of Attention](http://deconcentration-of-attention.com/).
 631 | 
 632 | ### Rational and Unconscious
 633 | 
 634 | Engineers create rational artifacts that may appear simple and mundane. However,
 635 | the process behind their creation often involves deep reflection and can stem
 636 | from the unconscious mind.
 637 | 
 638 | ### Humans are not designed for Big Numbers
 639 | 
 640 | If you have to work with a large number of entities, for example to process
 641 | 10000 files or gigabytes of data, start by reducing the quantity to the smallest
 642 | number of entities that still makes sense for a prototype of your final work:
 643 | make it work with 1 file instead of 10000, or with 20 bytes instead of 20
 644 | gigabytes.
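A minimal sketch of this idea in Python. The `process_all` helper, its `limit` parameter and the `process_file` step are hypothetical names invented for illustration: the same pipeline is first exercised on a single entity and scaled up only afterwards.

```python
from itertools import islice
from typing import Callable, Iterable, List, Optional


def process_all(
    items: Iterable[str],
    process: Callable[[str], str],
    limit: Optional[int] = None,
) -> List[str]:
    """Apply `process` to the items, optionally only to the first `limit`."""
    # islice(items, None) yields everything, so limit=None means "no limit".
    return [process(item) for item in islice(items, limit)]


# Hypothetical stand-in for the real per-file work.
def process_file(name: str) -> str:
    return name.upper()


file_names = (f"file_{i}.txt" for i in range(10_000))

# Prototype on a single file first; remove the limit once this works.
first_result = process_all(file_names, process_file, limit=1)
print(first_result)  # ['FILE_0.TXT']
```

Once the single-entity run behaves as expected, the same call without `limit` covers the full set.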
 645 | 
 646 | ### There is no such thing as Many
 647 | 
 648 | Many does exist, but it is difficult for a human mind to grasp. Many needs an
 649 | Umbrella that turns it into One in the way we think about it. Many can be
 650 | homogeneous, like an array of objects of the same type, or heterogeneous, like a
 651 | bunch of instructions in the code, multiple functions in a test class, or a
 652 | set of User Profile fields of various types: name (string), age (int), settings
 653 | (object). Collections are easier because they hide Many from us behind a
 654 | well-defined interface: `containsObject`, `getAtIndex`, `enumerateWithIndex`,
 655 | which saves us from dealing with Many directly. Heterogeneous Many is harder:
 656 | you have to understand and organize it yourself: group instructions into
 657 | meaningful functions, group fields into meaningful containers like structs or
 658 | database tables.
 659 | 
 660 | One programming construct that fails to constrain Many is the tuple: you start
 661 | with things like `let person = ("John", 32)`, `let (name, age) = person`, or
 662 | `person.1`, but you quickly find yourself in a mess when the number of elements
 663 | grows to a real Many (quick lesson: don't use tuples, use structs!). If you have
 664 | Many, find a way to think about it and work with it as One.
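The same contrast can be sketched in Python, where a `dataclass` plays the role of a struct. The `Person` type below is a hypothetical illustration, not code from any particular project.

```python
from dataclasses import dataclass

# Positional tuple: the meaning of each slot lives only in the reader's head.
person_tuple = ("John", 32)
name, age = person_tuple


# A dataclass acts as a struct: the fields are named, so adding a third or
# a tenth field does not shift the meaning of the existing ones.
@dataclass(frozen=True)
class Person:
    name: str
    age: int


person = Person(name="John", age=32)
print(person.name, person.age)
```

With named fields, the Many of the individual values is handled as One `Person`.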
 665 | 
 666 | ### 0-1-2-Many I
 667 | 
 668 | Most people start saying "so many" or "infinite" when there are actually only 3
 669 | or 4 things on the table, rarely more. Variations such as 1a, 1b, 2a, 2b are
 670 | still within the limit of 3 or 4. This works like an ancient calculator: 0, 1, 2
 671 | and then "many". The algebra is fairly simple: 0 + 1 = 1, 1 + 1 = 2, 2 + 1 =
 672 | many, 2 + 2 = many, etc. Consequence: people are quite receptive to small
 673 | numbers. Say something like "this consists of 3 steps" and people will get it.
 674 | Don't say "seven". See also **Humans are not designed for Big Numbers**.
 675 | 
 676 | ### 0-1-2-Many II
 677 | 
 678 | Don't start to abstract or DRY from just two things. Wait until you have at
 679 | least 3 of them. See also **Duplication is better than poor abstraction**.
 680 | 
 681 | ### Masking (Shadowing)
 682 | 
 683 | Masking/shadowing of all kinds is dangerous and should be avoided or treated
 684 | with great care.
 685 | 
 686 | Examples:
 687 | 
 688 | - errors introduced to the systems when overlapping requirements are implemented
 689 |   over time
 690 | - masking in MC/DC
 691 | - shadowing of variable declarations
 692 | - typographically ambiguous symbols with overlapping visibility like `l` and
 693 |   `1`, `O` and `0` (see MISRA guidelines)
 694 | - code reviews: real bugs can hide behind less important but more noticeable
 695 |   issues like typos or coding style details
 696 | - bugs often hide themselves behind complexity
 697 | 
 698 | See also Overlapping Control.
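As an illustration of the "shadowing of variable declarations" item, here is a small Python sketch. The `clamp` functions are hypothetical; the point is that a parameter named `max` masks the built-in function of the same name within the function body.

```python
def clamp_shadowed(value, max):
    # The parameter `max` masks the built-in `max` here, so the call below
    # tries to call an integer and raises TypeError.
    return max(0, min(value, max))


def clamp(value, upper):
    # A distinct parameter name keeps the built-in `max` visible.
    return max(0, min(value, upper))


print(clamp(15, 10))  # 10
```

The shadowed version looks almost identical to the working one, which is exactly why this kind of masking is easy to miss in a review.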
 699 | 
 700 | ## Design
 701 | 
 702 | ### Poor Abstraction
 703 | 
 704 | > Duplication is better than poor abstraction (Sandi Metz, Rails Club 2014,
 705 | > Moscow).
 706 | 
 707 | > "...ill-fitting structure is worse than none..." (Eric Evans - Domain-Driven
 708 | > Design, p.446)
 709 | 
 710 | A good example from https://www.sigbus.info/worse-is-better:
 711 | 
 712 | > In lld v2, we decided not to use an intermediate representation. Instead, we
 713 | > directly handle platform-dependent native file formats. lld v2 consists of
 714 | > virtually three different linkers for Windows, macOS and Unix. They share the
 715 | > same design but do not share code. Naturally, we sometimes had to write very
 716 | > similar code for each target. This may seem like an amateur-level programming
 717 | > mistake, but in reality, it's much easier to write straightforward code for
 718 | > each target than writing unified one that covers all the details and corner
 719 | > cases of all supported targets simultaneously.
 720 | 
 721 | ### Cost of Abstraction
 722 | 
 723 | Software engineering often involves creating abstractions. A solution to a
 724 | problem can include more or fewer abstractions, but each introduced abstraction
 725 | comes with a cost. This cost manifests as the cognitive burden placed on those
 726 | who need to understand, maintain, and document it – not just in code, but also
 727 | in models, documentation, and even organizational structures.
 728 | 
 729 | Cognitively, an abstraction can be thought of as a mental gadget that one must
 730 | "install" in order to work with it. Imagine an empty room that needs to be
 731 | furnished according to a specific use case. If the chosen abstractions fit well
 732 | within the team's mental model, the space remains functional – like a
 733 | well-furnished room where people can move freely and use it as intended.
 734 | However, if abstractions are difficult to grasp or combine in contradictory
 735 | ways, the mental space becomes cluttered, leaving little room to maneuver. This
 736 | is similar to a room overloaded with furniture, making it difficult to navigate
 737 | or even understand its intended purpose.
 738 | 
 739 | For example, if a team introduces a new abstraction X, it incurs the following
 740 | costs:
 741 | 
 742 | - Every developer must understand and adopt X to work effectively within the
 743 |   system.
 744 | - The system must be structured around X in a way that ensures maintainability
 745 |   over time.
 746 | - Long-term maintenance will require keeping the code, file structure, and
 747 |   models aligned with X, often introducing additional overhead.
 748 | 
 749 | Introducing too many incompatible abstractions – or a few abstractions that
 750 | consume too much of the decision space – can quickly lead to over-engineering.
 751 | Those responsible for maintaining such systems often find themselves
 752 | disentangling unnecessary complexity, seeking a new balance that restores
 753 | manageability by replacing or introducing more adequate abstractions.
 754 | 
 755 | ### Habitability
 756 | 
 757 | Habitable software is better than perfect software.
 758 | 
 759 | [Richard Gabriel - Patterns of Software, Habitability and Piecemeal Growth](https://www.dreamsongs.com/Files/PatternsOfSoftware.pdf).
 760 | 
 761 | > Habitability is the characteristic of source code that enables programmers,
 762 | > coders, bug-fixers, and people coming to the code later in its life to
 763 | > understand its construction and intentions and to change it comfortably and
 764 | > confidently. Either there is more to habitability than clarity or the two
 765 | > characteristics are different...
 766 | 
 767 | > ...Habitability makes a place livable, like home. And this is what we want in
 768 | > software – that developers feel at home, can place their hands on any item
 769 | > without having to think deeply about where it is. It's something like clarity,
 770 | > but clarity is too hard to come by.
 771 | 
 772 | ### Hard Feature
 773 | 
 774 | If a feature is hard to implement, it might indicate that there is something
 775 | wrong with the feature (or the product).
 776 | 
 777 | ### True Name
 778 | 
 779 | If you know the [True Name](https://en.wikipedia.org/wiki/True_name) of
 780 | something, you have power over it. In OOP, a good class name is a True Name.
 781 | 
 782 | > "A well-chosen word can save an enormous amount of thought", (said by Mach
 783 | > according to S.R.Cajal, Santiago Ramón y Cajal, "Advice for a young
 784 | > investigator")
 785 | 
 786 | See also
 787 | [Mass and Gravity](http://www.carlopescio.com/2008/12/notes-on-software-design-chapter-2-mass.html).
 788 | 
 789 | ### One Pattern per Class
 790 | 
 791 | A class violates the Single Responsibility Principle if it implements more
 792 | than one design pattern. Of course, there are exceptions.
 793 | 
 794 | ### Archetype
 795 | 
 796 | Archetype is an umbrella concept for other concepts like `prototype`,
 797 | `proof of concept`, and `minimum viable product`. Archetype means something
 798 | simple and coherent. If you know the archetype of something, you understand its
 799 | essence. A complex system can be traced back to one or more underlying
 800 | archetypes.
 801 | 
 802 | Interesting side note: as far as I can see, engineers tend to stop caring about
 803 | the underlying archetypes as their software grows bigger. Imagine how easy it
 804 | would be to learn about a piece of software if it kept its own earliest forms
 805 | (source code, documentation, drafts, etc.) alongside it. Great example: the Rust
 806 | programming language had to start from
 807 | [somewhere](https://github.com/graydon/rust-prehistory).
 808 | 
 809 | > "View the problem in its simplest forms ... An excellent method for
 810 | > determining the meaning of something is to find out how it comes to be what it
 811 | > is." (Santiago Ramón y Cajal, "Advice for a young investigator")
 812 | 
 813 | ### Prima Materia
 814 | 
 815 | Sometimes, to make further progress, you need to un-implement (break!) a
 816 | particular pattern/architecture/solution, put it back into a
 817 | [Prima Materia](https://en.wikipedia.org/wiki/Prima_materia) state, and only
 818 | then transform it into something new. Metaphors similar to Prima Materia are
 819 | "primordial soup" and "undifferentiated soup of ideas" (Eric Evans - DDD).
 820 | 
 821 | ### Mature automation
 822 | 
 823 | Mature automation allows itself to be observed, inspected, and overridden. Even
 824 | if something is automated and usually works well, there should always be a way
 825 | to turn it off or adjust it when needed. Good automation is transparent – you
 826 | can see what it is doing, understand how it works, troubleshoot problems, and
 827 | make changes if necessary. In some situations, it is important to bypass
 828 | automation entirely and take manual control or use an alternative path. Systems
 829 | that do not allow this create unnecessary friction and risk. Automation should
 830 | support people, not trap them.
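A minimal Python sketch of automation that can be observed, switched off, and overridden. The `AutoScaler` class, its fields and its scaling rule (one worker per ten queued items) are all hypothetical and serve only to illustrate the three properties.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AutoScaler:
    """Hypothetical automation that picks a worker count from queue length."""

    enabled: bool = True                                 # can be turned off
    manual_workers: Optional[int] = None                 # can be overridden
    decisions: List[str] = field(default_factory=list)   # can be observed

    def workers_for(self, queue_length: int) -> int:
        if self.manual_workers is not None:
            self.decisions.append(f"manual override: {self.manual_workers}")
            return self.manual_workers
        if not self.enabled:
            self.decisions.append("automation disabled: keeping 1 worker")
            return 1
        workers = max(1, queue_length // 10)
        self.decisions.append(f"auto: queue={queue_length} -> {workers}")
        return workers


scaler = AutoScaler()
scaler.workers_for(45)       # the automation decides
scaler.manual_workers = 2    # a human takes over
scaler.workers_for(45)
print(scaler.decisions)
```

The `decisions` log makes every choice inspectable, and the two switches give people a way out when the automation does not fit the situation.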
 831 | 
 832 | ### "Magic" is automation that is not adequate
 833 | 
 834 | In the beginning, there is no magic, but simply a desire to automate things to
 835 | reduce repetition. Magic appears as a result of increasing complexity that makes
 836 | the current solution inadequate for further progress. Magic can also emerge
 837 | rather quickly as a result of automating the wrong things from the beginning.
 838 | The holy grail is automation that is always adequate.
 839 | 
 840 | ### Poisonous Systems
 841 | 
 842 | Badly designed systems tend to poison systems they interact with.
 843 | 
 844 | ### Bad Design in House
 845 | 
 846 | Do not overdesign your own software if you have a big producer of bad or too
 847 | opinionated designs nearby. A big producer can be a vendor or a team with
 848 | authority who decided to rely on a given design a while ago.
 849 | 
 850 | ### Trade-off of Encapsulation
 851 | 
 852 | Strong, "tight" encapsulation is good, but don't forget about the users:
 853 | Operations people. A good example is debugging facilities: if you close
 854 | everything, you leave the ops people, who might be you, without any tools to
 855 | understand or tweak your system. Richard Cook explains this very well, see
 856 | [Velocity 2012: Richard Cook, "How Complex Systems Fail"](https://www.youtube.com/watch?v=2S0k12uZR14).
 857 | 
 858 | ### Unnecessary Flexibility
 859 | 
 860 | (from [Writing Solid Code](http://writingsolidcode.com/))
 861 | 
 862 | > Flexibility breeds bugs. Another strategy you can use to prevent bugs is to
 863 | > strip unnecessary flexibility from your designs... The trouble with flexible
 864 | > designs is that the more flexible they are, the harder it is to detect bugs.
 865 | 
 866 | > ...Flexible features are troublesome because they can lead to unexpected
 867 | > "legal" situations that you didn't think to test for or even realize were
 868 | > legal...
 869 | 
 870 | > ...When you implement features in your own projects, make them easy to use;
 871 | > don't make them unnecessarily flexible. There is a difference. Don't allow
 872 | > unnecessary flexibility.
 873 | 
 874 | ### Black Box with a Green Play Button
 875 | 
 876 | The ideal interface for a system of arbitrary complexity is a black box with a
 877 | green play button on it: you take the box, press the green button, and it just
 878 | works. The second ideal interface also has a red button to stop the system.
 879 | 
 880 | ### Single Source Concept and Its Exceptions
 881 | 
 882 | The Single Source (of Truth) concept is one of the first principles beginner
 883 | programmers learn and often becomes a rule they follow rigorously. However, like
 884 | many principles in life, it has its exceptions. Blindly adhering to the Single
 885 | Source rule can sometimes lead to suboptimal results.
 886 | 
 887 | A good example of when this principle might fail is the
 888 | [Poor Abstraction](#poor-abstraction) scenario. This happens when someone tries
 889 | to consolidate similar elements into a single source while ignoring their
 890 | significant differences. In such cases, forcing everything into one place can
 891 | create an abstraction that is brittle, confusing, or overly complex, ultimately
 892 | making the system harder to understand and maintain.
 893 | 
 894 | Another example is
 895 | [Two Almost Identical Entities](#two-almost-identical-entities). This occurs
 896 | when someone tries to merge two seemingly identical entities into one, which
 897 | results in an overly complicated "Single Source of Truth" codebase. This
 898 | approach often leads to significant branching logic and reduced readability,
 899 | making the code harder to work with and more prone to errors.
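The branching described above can be sketched as follows. The invoice/quote report domain and all function names are hypothetical, chosen only to show how a forced single source turns every difference into a branch.

```python
# Forced "single source": one function serves two almost identical report
# kinds, so every difference between them becomes a branch.
def render_report(kind: str, title: str, rows: list) -> str:
    lines = []
    if kind == "invoice":
        lines.append(f"INVOICE: {title}")
    else:
        lines.append(f"Quote: {title}")
    for name, price in rows:
        if kind == "invoice":
            lines.append(f"{name}: {price:.2f}")
        else:
            lines.append(f"{name}: ~{price:.0f}")
    return "\n".join(lines)


# Two straightforward functions: some duplication, no branching.
def render_invoice(title: str, rows: list) -> str:
    return "\n".join([f"INVOICE: {title}"] + [f"{n}: {p:.2f}" for n, p in rows])


def render_quote(title: str, rows: list) -> str:
    return "\n".join([f"Quote: {title}"] + [f"{n}: ~{p:.0f}" for n, p in rows])
```

Both variants produce the same output today; they differ in how many places must be understood and retested when only one of the two report kinds changes.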
 900 | 
 901 | Understanding when to apply the Single Source principle and when to allow for
 902 | exceptions is crucial for achieving balance and maintaining flexibility in
 903 | software design. Knowing where to follow and where to de-prioritize the Single
 904 | Source principle is a skill that distinguishes an experienced programmer from a
 905 | beginner.
 906 | 
 907 | ### Resilience to Change vs Fixed Perfect Solutions
 908 | 
 909 | When designing a system, there is a trade-off between making it easier to change
 910 | in the future and striving for perfection. In most cases, choosing flexibility
 911 | is the better option. If you anticipate changes in context or additional
 912 | development work that could affect the system, avoid focusing too much on
 913 | perfecting the existing solution, as it may not hold up under new pressures.
 914 | Another important consideration is the ability to undo or disable a function
 915 | that works perfectly now but could cause unforeseen issues in operation. Often,
 916 | a perfectly working solution can create obstacles for other systems or people
 917 | involved in operating the system.
 918 | 
 919 | ### Two Almost Identical Entities
 920 | 
 921 | Over the years I have seen at least three big units of hardly manageable legacy
 922 | code, each of which was built on two almost identical entities. There are three
 923 | ways for such entities to co-exist:
 924 | 
 925 | 1. One is a subclass of the other.
 926 | 2. Two almost identical hierarchies are maintained.
 927 | 3. Two groups of helper functions without a clear separation of
 928 |    responsibilities between them.
 929 | 
 930 | It seems that historically, in all three cases, it started with one entity that
 931 | accumulated its features along the way. Then came the other, which was so
 932 | similar to the first that the programmer avoided extracting the modules that
 933 | both entities had in common and instead went with subclassing, or with two
 934 | parallel hierarchies, to get the result quickly.
 935 | 
 936 | To this day I have not seen or created an elegant solution to this problem.
 937 | See also "Hard Feature".
 938 | 
 939 | ### Control
 940 | 
 941 | One of the key concerns is Control: where control should or should not be, what
 942 | should have control (be active) and what should not (be passive).
 943 | 
 944 | #### Observable Control
 945 | 
 946 | Software should be designed in such a way that there is always a dedicated
 947 | place where it is obvious how control and work flow through the software. This
 948 | should hold on all levels of abstraction, and on each level such a dedicated
 949 | place should be free of the lower-level implementation details that make the
 950 | context harder to understand.
 951 | 
 952 | If something creates low-level implementation noise on a given level, it might
 953 | be a good sign that one or more underlying lower layers exist where that
 954 | lower-level implementation can be represented as high-level workflow logic
 955 | (a sequence of steps or an algorithm).
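A small Python sketch of such a dedicated place. The pipeline and its three steps are hypothetical; what matters is that `run_pipeline` shows the flow of work and nothing else, while each step keeps its own details one level down.

```python
# Lower level: each step hides its own implementation details.
def parse(raw: str) -> list:
    return [int(part) for part in raw.split(",") if part.strip()]


def validate(numbers: list) -> list:
    return [n for n in numbers if n >= 0]


def summarize(numbers: list) -> dict:
    return {"count": len(numbers), "total": sum(numbers)}


# Higher level: the dedicated place where the flow of work is obvious.
def run_pipeline(raw: str) -> dict:
    numbers = parse(raw)
    valid = validate(numbers)
    return summarize(valid)


print(run_pipeline("1, 2,-3,4"))  # {'count': 3, 'total': 7}
```

If string splitting or filtering rules appeared inline in `run_pipeline`, that would be the low-level noise hinting at a missing lower layer.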
 956 | 
 957 | #### Humans should dominate machines
 958 | 
 959 | The lower-level modules should not have control over higher-level modules. This
 960 | is not only about never importing higher-level modules into lower-level ones
 961 | and making everything work through protocols/interfaces, but more about the
 962 | flow of control: "what controls what". Two shortcuts: **humans should
 963 | dominate machines**, **business logic should dominate the system's
 964 | implementation details**.
 965 | 
 966 | #### Overlapping control
 967 | 
 968 | Overlapping things are a challenge for the human mind and are therefore bad for
 969 | the whole software lifecycle: design, development, testing and maintenance. This
 970 | might be two or more classes that do the same thing. This might be two or more
 971 | people whose responsibilities overlap. Nancy Leveson says Overlapping Control is
 972 | one of the greatest sources of safety problems: two controllers whose areas of
 973 | responsibility overlap (see "Engineering a Safer World"). See also "Two Almost
 974 | Identical Entities" and "Masking (Shadowing)".
 975 | 
 976 | #### Broken control loops
 977 | 
 978 | The top-level controllers should always have control over the bottom-level
 979 | elements. If the controllers include both humans and automation, the humans
 980 | should always be able to intervene and take over the control provided by the
 981 | automation.
 982 | 
 983 | This heuristic can be turned into an explicit design constraint.
 984 | 
 985 | ### Feedback
 986 | 
 987 | #### Broken feedback loops
 988 | 
 989 | Missing, insufficient or incorrect feedback is a great source of trouble for
 990 | any system.
 991 | 
 992 | "All feedback loops must be closed" - this heuristic can be turned into an
 993 | explicit design constraint.
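One way to read "closed" in code, sketched in Python under the assumption of a hypothetical `Heater` device that may silently ignore a command: after issuing the command, the controller reads the actual state back instead of assuming success.

```python
class Heater:
    """Hypothetical device that may silently ignore a command."""

    def __init__(self, healthy: bool = True) -> None:
        self.healthy = healthy
        self.is_on = False

    def turn_on(self) -> None:
        if self.healthy:
            self.is_on = True


def turn_on_open_loop(heater: Heater) -> None:
    # Open loop: the command is sent and success is merely assumed.
    heater.turn_on()


def turn_on_closed_loop(heater: Heater) -> None:
    # Closed loop: the command is followed by reading the actual state back.
    heater.turn_on()
    if not heater.is_on:
        raise RuntimeError("heater did not confirm that it is on")
```

With a faulty heater, the open-loop version returns normally and the problem surfaces much later, while the closed-loop version fails at the point where the feedback is missing.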
 994 | 
 995 | ### Separation / partitioning
 996 | 
 997 | - Separate stable from unstable
 998 | - Separate permanent from temporary
 999 | - Separate synchronous from asynchronous
1000 | - Separate similar from different
1001 | - Separate symmetrical from asymmetrical
1002 | - Balance and symmetry: if one partition has way more items than the other ones,
1003 |   this may indicate that the partitioning has not been complete.
1004 | - Separate construction from operation (one example: Factory vs Command)
1005 | - Separate content from presentation (applies to UI-heavy code, great example:
1006 |   HTML/CSS)
1007 | - Separate easy from complex: isolate easy, isolate complex, repeat many times
1008 | - Separate stateless from stateful
1009 | - Separate data from behavior and behavior from data unless you do have a good
1010 |   OOP class/object with good data/behavior balance.
1011 | - Separate general-purpose from application-specific
1012 | - Separate application-level code from system-level code
1013 | - Separate methods that read from methods that write
1014 | - Separate decision from condition
1015 | - Separate One from Many, separate Many from Many.
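Two items from the list above, "separate stateless from stateful" and "separate methods that read from methods that write", can be sketched in Python as follows. The `Counter` class and the `is_over_limit` function are hypothetical examples.

```python
class Counter:
    """Stateful part: owns the data and is the only place that mutates it."""

    def __init__(self) -> None:
        self._value = 0

    # Write method: changes state, returns nothing.
    def increment(self, step: int = 1) -> None:
        self._value += step

    # Read method: returns state, changes nothing.
    def value(self) -> int:
        return self._value


# Stateless part: a pure function that can be tested without any setup.
def is_over_limit(value: int, limit: int) -> bool:
    return value > limit


counter = Counter()
counter.increment()
counter.increment(2)
print(is_over_limit(counter.value(), limit=2))  # True
```

Because reading never changes anything, `value()` can be called freely while debugging, and the decision logic in `is_over_limit` does not depend on how the count was accumulated.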
1016 | 
1017 | Example 1: "Monolithic test case files"
1018 | 
1019 | In the following example the `_feature1_` or `_feature2_` parts and numbers in
1020 | the test method names assist a lot in logical grouping of the tested
1021 | functionality.
1022 | 
1023 | ```c
1024 | // Many group #1
1025 | void test_feature1_1(void) {}
1026 | void test_feature1_2(void) {}
1027 | void test_feature1_3(void) {}
1028 | // Many group #2
1029 | void test_feature2_1(void) {}
1030 | void test_feature2_2(void) {}
1031 | void test_feature2_3(void) {}
1032 | ```
1033 | 
1034 | Example 2: the inner block has a multiline routine which could actually be
1035 | another function that works on One. At the same time, this inner block works on
1036 | Many. Unless we create that other function, we have a conflict between the Many
1037 | of the enumeration and the Many of the instructions inside the block.
1038 | 
1039 | ```cpp
1040 | EnumerateInstructions(*function, [&](Instruction &instr, int bbIndex, int iIndex)
1041 | {
1042 |   // ... lots of lines working on `instr` ...
1043 | });
1044 | ```
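The extraction suggested in Example 2 can be sketched in Python. Here a function is modeled, purely as an assumption for illustration, as a list of basic blocks, each a list of instruction strings; all names are hypothetical.

```python
def enumerate_instructions(function):
    """Yield (instruction, block_index, instruction_index) for each one."""
    for block_index, block in enumerate(function):
        for instruction_index, instruction in enumerate(block):
            yield instruction, block_index, instruction_index


# Works on One: everything that used to be the long inner block.
def describe_instruction(instruction, block_index, instruction_index):
    return f"bb{block_index}.{instruction_index}: {instruction}"


# Works on Many: only the enumeration remains here.
def describe_function(function):
    return [describe_instruction(*item) for item in enumerate_instructions(function)]


print(describe_function([["load", "add"], ["ret"]]))
```

After the split, the Many of the enumeration and the Many of the per-instruction steps no longer share one block.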
1045 | 
1046 | ### Grouping
1047 | 
1048 | - Group together things that change at the same time. If possible create
1049 |   container data structures so that a change involves a change of **one**. If
1050 |   possible, group all the changes that happen at the same time together.
1051 | 
1052 | - Group things that are used together.
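A minimal Python sketch of a container data structure for values that change and travel together. The `Endpoint` type and the example host names are hypothetical.

```python
from dataclasses import dataclass, replace


# Before: three values that always change together travel separately,
# so every change touches three parameters in every signature.
def connect_loose(host: str, port: int, use_tls: bool) -> str:
    scheme = "https" if use_tls else "http"
    return f"{scheme}://{host}:{port}"


# After: one container, so a change involves a change of one thing.
@dataclass(frozen=True)
class Endpoint:
    host: str
    port: int
    use_tls: bool = True

    def url(self) -> str:
        scheme = "https" if self.use_tls else "http"
        return f"{scheme}://{self.host}:{self.port}"


production = Endpoint("example.org", 443)
staging = replace(production, host="staging.example.org")
print(staging.url())  # https://staging.example.org:443
```

Adding a fourth related value later, such as a timeout, would then change `Endpoint` only, not every function that passes the group along.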
1053 | 
1054 | ### Observability vs Correctness
1055 | 
1056 | Incorrect but observable code can be more valuable long-term than correct but
1057 | unobservable code. Observable code is easier to inspect, test, and improve, even
1058 | if it contains mistakes. In contrast, correct but hidden code can become
1059 | difficult to maintain and debug over time, creating technical debt. Visibility
1060 | allows for quicker fixes and ongoing improvement, making it more sustainable in
1061 | the long run.
1062 | 
1063 | ### Don't Use RAII on a Business Logic Level
1064 | 
1065 | RAII is good for resource management, such as handling memory, file handles, or
1066 | network connections, where resources need predictable acquisition and release.
1067 | However, applying RAII to business logic can lead to significant problems:
1068 | 
1069 | - Reduced flexibility: RAII assumes that actions are tied directly to scope, but
1070 |   business workflows may need to defer, combine, or otherwise manage actions
1071 |   independently of object lifetimes.
1072 | 
1073 | - Lack of transaction control: Business operations often involve external
1074 |   systems, validation, or rollback mechanisms that require precise control. RAII
1075 |   hides these processes behind object lifecycle management, making it harder to
1076 |   handle errors or maintain consistency.
1077 | 
1078 | - Unintended side effects: Business logic often involves workflows with complex
1079 |   rules and dependencies. Tying actions like adding or removing data to the
1080 |   lifecycle of objects can cause unexpected behaviors if those objects are
1081 |   destroyed prematurely or unintentionally.
1082 | 
1083 | - Debugging challenges: When business actions are implicitly triggered by object
1084 |   lifetimes, it becomes harder to trace when and why specific operations occur.
1085 |   This lack of clarity can lead to subtle bugs that are difficult to identify
1086 |   and fix.
1087 | 
1088 | Instead of using RAII, manage business logic explicitly through well-defined
1089 | methods or services. This approach keeps the logic transparent, easier to
1090 | understand, and more adaptable to changing requirements.
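Python has no RAII in the C++ sense, so the sketch below uses a context manager as the closest scope-bound analogue; this substitution, the stock reservation domain and all names are assumptions made for illustration. It contrasts a business action that fires implicitly at scope exit with the same workflow expressed through explicit calls.

```python
class Inventory:
    def __init__(self, stock: int) -> None:
        self.stock = stock


# Scope-bound style: the business action "return the goods to stock"
# fires implicitly whenever the block exits.
class ScopedReservation:
    def __init__(self, inventory: Inventory, amount: int) -> None:
        self.inventory = inventory
        self.amount = amount

    def __enter__(self) -> "ScopedReservation":
        self.inventory.stock -= self.amount
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.inventory.stock += self.amount  # hidden business decision
        return False


# Explicit style: every business action is a visible, named call.
class ReservationService:
    def __init__(self, inventory: Inventory) -> None:
        self.inventory = inventory
        self.reserved = 0

    def reserve(self, amount: int) -> None:
        if amount > self.inventory.stock:
            raise ValueError("not enough stock")
        self.inventory.stock -= amount
        self.reserved += amount

    def confirm(self) -> None:
        # The reserved goods are sold; they do not return to stock.
        self.reserved = 0

    def cancel(self) -> None:
        # The reserved goods go back to stock.
        self.inventory.stock += self.reserved
        self.reserved = 0
```

In the explicit version the caller chooses between `confirm()` and `cancel()`, and that choice appears in the code and in a stack trace. In the scoped version the release is decided by where the block happens to end.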
1091 | 
1092 | ## Coding, code reviews, and maintenance programming
1093 | 
1094 | ### Code that Works
1095 | 
1096 | Working code with a good-enough architecture is better than buggy code with a
1097 | perfect but overly complex architecture.
1098 | 
1099 | ### Code Is Not Your Partner
1100 | 
1101 | Sometimes, you don't have to be nice to code.
1102 | 
1103 | - It might be written for a different platform.
1104 | - It could be outdated or rely on ancient build tools.
1105 | - Some parts may be unnecessary for your needs.
1106 | - It may contain mistakes.
1107 | 
1108 | In such cases, it is perfectly fine to delete, modify, or hack the code – to
1109 | make it compile, test it, or simply understand how it works.
1110 | 
1111 | ### Two Strategies for Replacing a Feature
1112 | 
1113 | When replacing Feature A with Feature B, there are two broad approaches.
1114 | 
1115 | 1\. Remove A, Then Implement B
1116 | 
1117 | This strategy is best when:
1118 | 
1119 | - Feature A is simple.
1120 | - Feature B can be developed quickly.
1121 | - Switching to B is straightforward.
1122 | 
1123 | In such cases, removing A first and then building B works well, as the
1124 | transition is fast and manageable.
1125 | 
1126 | 2\. Develop B in Parallel, Switch from A to B, Remove A
1127 | 
1128 | This approach is necessary when the transition is complex or time-consuming.
1129 | Instead of removing A immediately, B is developed alongside it while the
1130 | existing system remains operational. The switch to B happens only when it is
1131 | fully developed and tested. A remains available as a fallback until B is proven
1132 | reliable, after which A can be removed.
1133 | 
1134 | This method is particularly useful when:
1135 | 
1136 | - Feature B requires significant development time.
1137 | - Switching from A to B is complex and requires a dedicated transition
1138 |   mechanism.
1139 | 
1140 | For already deployed systems where downtime is unacceptable, the second approach
1141 | is often the only viable way to ensure a smooth migration.
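The second strategy can be sketched in Python as a switch with a fallback. The two `feature_*` functions and the `use_b` flag are hypothetical stand-ins; in a real system the switch might be a configuration value or a feature flag service.

```python
def feature_a(text: str) -> str:
    """Existing implementation, kept as a fallback during the transition."""
    return text.strip().lower()


def feature_b(text: str) -> str:
    """New implementation, developed alongside A."""
    return " ".join(text.split()).lower()


def normalize(text: str, use_b: bool = False) -> str:
    # The switch is a single, dedicated transition mechanism.
    if not use_b:
        return feature_a(text)
    try:
        return feature_b(text)
    except Exception:
        # A stays available until B is proven reliable.
        return feature_a(text)


print(normalize("  Hello   World ", use_b=True))  # hello world
```

Once B has proven reliable, removing A means deleting `feature_a` and the flag in one small, easily reviewed change.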
1142 | 
1143 | ### Smallest Scope
1144 | 
1145 | - Restrict the scope of data to the smallest possible. (The Power of 10: Rules
1146 |   for Developing Safety-Critical Code by NASA)
1147 | 
1148 | ### Code Style as a Blocker
1149 | 
1150 | Sometimes code style can be a blocker. Poorly formatted code can be extremely
1151 | difficult to understand. Do everything you can to reduce your cognitive load.
1152 | Real-world example:
1153 | 
1154 | ```swift
1155 | let expectedRemainingLoops = Int(ceil( (expectedRemainingElements - Double(currentRemainingElementsForLoop)) / Double(PPENumberOfTasksInCurrentLoop) ))
1156 | ```
1157 | 
1158 | reads much better when formatted as:
1159 | 
1160 | ```swift
1161 | let expectedRemainingLoops =
1162 |   Int(
1163 |     ceil(
1164 |       (expectedRemainingElements - Double(currentRemainingElementsForLoop)) /
1165 |       Double(PPENumberOfTasksInCurrentLoop)
1166 |     )
1167 |   )
1168 | ```
1169 | 
1170 | ### Simplifying Complex Feature Branches
1171 | 
1172 | When working on a non-trivial feature branch, consider breaking it down into its
1173 | core functionality while separating any trivial or unrelated changes that can be
1174 | integrated independently.
1175 | 
1176 | A complex branch can often become more manageable, or even medium in scope, when
1177 | distilled into its essential parts and split into smaller, separate changes. In
1178 | some cases, breaking it down properly can eliminate the complexity entirely,
1179 | leaving only straightforward, incremental updates.
1180 | 
1181 | ### The Moving and Changing Anti-pattern
1182 | 
1183 | A great anti-pattern that complicates code reviews is creating a changeset that
1184 | involves both moving and changing things at the same time. This obscures the
1185 | diffs in the version control system, making it harder to track changes. The
1186 | solution: isolate moving and changing into separate commits or separate PRs.
1187 | 
1188 | ### Avoid Plural Names For Classes
1189 | 
1190 | Classes should represent a single entity or concept. Naming a class in the
1191 | plural form (e.g., `Users`) can confuse its responsibility, making it seem like
1192 | it manages multiple instances. Instead, use singular names (e.g., `User`) and
1193 | handle collections separately, such as in a `UserList` or `UserRepository`. This
1194 | ensures clear, focused class responsibilities.
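
A minimal sketch of this naming idea in Python. The `User` fields and the
`UserRepository` methods are hypothetical and chosen only for illustration:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class User:
    """A single user: the singular name matches the single entity."""

    user_id: int
    name: str


class UserRepository:
    """Owns the collection; the name says that it handles many users."""

    def __init__(self) -> None:
        self._users = {}  # maps user_id to User

    def add(self, user: User) -> None:
        self._users[user.user_id] = user

    def find(self, user_id: int) -> Optional[User]:
        # Returns None when no user with this id is stored.
        return self._users.get(user_id)
```

With this split, `User` stays a plain value and everything about storing and
looking up many users lives in one clearly named place.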
1195 | 
1196 | ### Fast Programming and Slow Programming
1197 | 
1198 | This can be viewed as prototype vs. maintenance programming. Fast Programming is
1199 | crucial for rapid progress and is often encouraged by the business. However, it
1200 | rarely allows time to learn from mistakes due to the tunnel vision and "straight
1201 | ahead" thinking that often accompany it. Slow Programming, on the other hand,
1202 | has the virtue of reflection and deeper analysis, but it tends to be too slow to
1203 | launch a business from scratch. Business leaders typically start to appreciate
1204 | Slow Programming only when they hit the wall of complexity, realizing the need
1205 | for proper design.
1206 | 
1207 | ### Stable Components
1208 | 
1209 | Stable components are the refuge of a maintenance programmer. One way for a
1210 | developer to survive in a large legacy project is to create stable components or
1211 | extract them from the existing mess of code. A stable component most likely means
1212 | a testable component: it can be a parsing module, an API layer, or string
1213 | manipulation helpers. Having such islands of stability helps a lot in overcoming
1214 | the difficulties of maintenance programming. See also Periphery and Prima
1215 | Materia Heuristics.
1216 | 
1217 | ### Boring Code
1218 | 
1219 | (from [Writing Solid Code](http://writingsolidcode.com/))
1220 | 
1221 | > If your code feels tricky, that's your gut telling you that something isn't
1222 | > right. Listen to your gut. If you find yourself thinking of a piece of code as
1223 | > a neat trick, you're really saying to yourself that an algorithm produces
1224 | > correct results even though it is not apparent that it should. The bugs won't
1225 | > be apparent to you either.
1226 | 
1227 | > Be truly clever; write boring code. You'll have fewer bugs, and the
1228 | > maintenance programmers will love you for it.
1229 | 
1230 | ### Boring Code 2
1231 | 
1232 | Complex software is not to be developed and used by average programmers. This
1233 | happens anyway because of production pressures. People say: your mileage may
1234 | vary.
1235 | 
1236 | ### Lack of Knowledge
1237 | 
1238 | Bad code stems from a lack of knowledge, not malice, even though both bad code
1239 | and malice share unawareness as their root cause. Sometimes, it helps to put on
1240 | a "lack-of-knowledge hat" to better understand the intentions behind the code
1241 | you're reading.
1242 | 
1243 | ### Lack of Knowledge II
1244 | 
1245 | An interesting feature of inexperience is that it imposes limits on a software
1246 | system's ability to scale. Software written with unawareness at its core will
1247 | eventually become rigid and nightmarish, to the point where team members start
1248 | avoiding the "dark forest" of its codebase. The natural consequence is that such
1249 | software reaches an upper bound of complexity. Paradoxically, this means that
1250 | someone tasked with re-engineering it will often find its complexity manageable
1251 | in the end.
1252 | 
1253 | ### Goodwill vs Pain
1254 | 
1255 | Much of what we programmers learn over the years comes from pain, not from
1256 | goodwill.
1257 | 
1258 | ## Biases
1259 | 
1260 | ### If It Works, Then It Works Bias
1261 | 
1262 | One of the common cognitive biases in engineering is the assumption that if
1263 | something works, it must be good enough. This belief often surfaces during
1264 | reviews of code, design, or systems that have passed tests or are known to
1265 | function under specific conditions. It takes conscious effort to question
1266 | something that already appears successful.
1267 | 
1268 | But just because something works under one set of constraints does not mean it
1269 | will hold up under others. Often, "it works" simply means "it works here and
1270 | now".
1271 | 
1272 | To counter this bias, reviewers should look beyond surface-level functionality
1273 | and ask:
1274 | 
1275 | - It works with a file of size X. What about 10X or 100X?
1276 | - It works under normal conditions. What about a slow network or high CPU load?
1277 | - It works on Linux. Will it behave the same on embedded hardware?
1278 | 
1279 | This bias also affects how we treat existing systems. A solution that "has
1280 | always worked" may be treated as correct by default, leading to investigations
1281 | based on flawed assumptions and missed problems that emerge under different
1282 | circumstances.
1283 | 
1284 | There's no silver bullet for overcoming this bias. The key is maintaining
1285 | deliberate skepticism and making a habit of viewing solutions from multiple
1286 | angles.
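
The file-size question from the list above can be turned into a small, repeatable
check. This is only a sketch: `count_lines` is a hypothetical function under
review, and the scale factors are arbitrary assumptions.

```python
import io


def count_lines(stream) -> int:
    """Hypothetical function under review: counts lines in a text stream."""
    return sum(1 for _ in stream)


def check_at_scale(base_lines: int, factors=(1, 10, 100)) -> dict:
    """Run the same correctness check at X, 10X, and 100X input size."""
    results = {}
    for factor in factors:
        expected = base_lines * factor
        stream = io.StringIO("x\n" * expected)
        # Record whether the function still gives the right answer.
        results[factor] = count_lines(stream) == expected
    return results
```

The same loop can be extended to record run time or memory per factor, which
makes "it works here and now" visible as a single data point among several.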
1287 | 
1288 | ### Focusing only on what's most visible bias
1289 | 
1290 | This is the tendency to concentrate a review or investigation on the most
1291 | obvious, observable, or symptomatic parts of a system rather than systematically
1292 | considering all potential contributing factors. This can lead to overlooking the
1293 | true root cause, especially if it's hidden in a less familiar or less accessible
1294 | area. Before jumping on a specific part of the problem or solution, first step
1295 | back and consider the bigger picture: which blocks in general might be
1296 | involved. As the common saying goes: "Don't look only where there is light". In
1297 | practice, this means listing all possible contributors to a problem in the form
1298 | of a block diagram or any other simple sketch that collects both the symptoms
1299 | and relevant system parts. It can also help to annotate each block with relevant
1300 | properties. For example, in a performance investigation, adding performance
1301 | characteristics per block can highlight which parts are likely causes, not just
1302 | the ones that appear most problematic.
1303 | 
1304 | ### The Fix Bias
1305 | 
1306 | When reviewing a pull request titled "Fixes XYZ", there is a natural tendency to
1307 | trust the new change more than the existing code. This bias arises from the
1308 | assumption that the previous implementation was flawed simply because it is
1309 | being replaced. As a result, one might overlook the consequences of the fix or
1310 | fail to rigorously verify the correctness of the new change.
1311 | 
1312 | To mitigate this bias, it's important to evaluate both the old and new
1313 | implementations with equal scrutiny. Consider questions such as:
1314 | 
1315 | - Is the problem being solved accurately identified?
1316 | - Does the new change address the issue without introducing new problems?
1317 | - Are the trade-offs of this fix justified compared to the original
1318 |   implementation?
1319 | 
1320 | By being aware of this bias, reviewers can ensure a more balanced and thorough
1321 | review process.
1322 | 
1323 | ### Resolving Merge Conflict Bias
1324 | 
1325 | Software engineers frequently resolve merge conflicts, and while this task is
1326 | often trivial, it presents opportunities for introducing subtle bugs. One
1327 | contributing factor is the cognitive bias that favors accepting newly introduced
1328 | changes over preserving existing behavior.
1329 | 
1330 | Git's conflict markers (`<<<<<<<`, `=======`, `>>>>>>>`) can obscure important
1331 | details of the original code, making it easy to discard needed logic by mistake.
1332 | 
1333 | A practical approach to mitigating this risk is to slow down and carefully
1334 | evaluate both conflicting versions. Consider not just the new change, but also
1335 | what might be lost if an existing line or code chunk is removed. Reviewing the
1336 | code in context and testing after resolving conflicts can help prevent
1337 | unintended regressions.
1338 | 
1339 | ## Reliability
1340 | 
1341 | ### Errors are not ok
1342 | 
1343 | Never ignore errors. The presence of errors indicates that you don't understand
1344 | your system well enough and therefore don't have full control over it.
1345 | 
1346 | Whether major or minor, an error contributes negatively to the design and
1347 | operation of your system, and also to your understanding of it (see
1348 | [Periphery](#periphery)).
1349 | 
1350 | Errors typically ignored by developers include:
1351 | 
1352 | - Configuration errors
1353 | - Compiler warnings
1354 | - Build system errors
1355 | - Errors produced by the test suites (flaky tests)
1356 | 
1357 | ### Errors must be understood and described
1358 | 
1359 | Search for `Malfunction 54` (the Therac-25 error message) for a good example.
1360 | 
1361 | ### Underlying errors shall not be hidden
1362 | 
1363 | If a higher-level error wraps some other underlying error, the information about
1364 | the underlying error shall not be lost. Instead, it should be fully available to
1365 | the higher-level error for error handling, logging, tracing, etc.
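
A minimal sketch of this rule in Python, using the language's exception
chaining (`raise ... from ...`). The `ConfigError` type and the `parse_port` and
`describe` helpers are hypothetical names used only for illustration:

```python
class ConfigError(Exception):
    """Higher-level error (hypothetical) raised by a configuration layer."""


def parse_port(raw: str) -> int:
    try:
        return int(raw)
    except ValueError as exc:
        # "from exc" stores the low-level error in __cause__, so it stays
        # available for handling, logging, and tracebacks.
        raise ConfigError(f"invalid port value: {raw!r}") from exc


def describe(error: BaseException) -> str:
    """Render an error together with the chain of its underlying causes."""
    parts = []
    current = error
    while current is not None:
        parts.append(f"{type(current).__name__}: {current}")
        current = current.__cause__
    return " <- ".join(parts)
```

A caller that catches `ConfigError` can still inspect `error.__cause__` or log
`describe(error)`, so nothing about the original `ValueError` is lost.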
1366 | 
1367 | ### Critical errors vs non-critical errors
1368 | 
1369 | Make a clear distinction between critical and non-critical errors on all levels:
1370 | source code, software design, error reporting, documentation.
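
At the source code level, one possible way to make the distinction explicit is
to give each kind of error its own type. This is a sketch under assumptions: the
`NonCriticalError` and `CriticalError` classes and the `run_all` helper are
hypothetical.

```python
import logging

logger = logging.getLogger(__name__)


class NonCriticalError(Exception):
    """The operation failed, but the system can continue (hypothetical)."""


class CriticalError(Exception):
    """The system cannot continue safely (hypothetical)."""


def run_all(tasks):
    """Run tasks; skip non-critical failures, stop at the first critical one."""
    completed = []
    for task in tasks:
        try:
            completed.append(task())
        except NonCriticalError as exc:
            # Recorded and skipped: the remaining tasks still run.
            logger.warning("task skipped: %s", exc)
        # CriticalError is deliberately not caught here, so it propagates
        # to the caller and stops the run.
    return completed
```

The same two categories can then be reused in error reports and documentation,
so that every level of the system speaks about errors in the same terms.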
1371 | 
1372 | ### Assertions are better than no error handling
1373 | 
1374 | When there is no error handling, the presence of assertions gives at least some
1375 | basic guarantee that the software does not do what it is not supposed to do.
1376 | 
1377 | ### Assertions are shortcuts for a proper error handling
1378 | 
1379 | Every assertion eventually becomes proper error handling.
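
A small sketch of how that evolution might look in Python. Both functions and
the `EmptyInputError` type are hypothetical examples:

```python
def average_v1(values):
    """First version: an assertion guards the precondition."""
    assert len(values) > 0, "average_v1() requires at least one value"
    return sum(values) / len(values)


class EmptyInputError(ValueError):
    """Raised when there is nothing to average (hypothetical error type)."""


def average_v2(values):
    """Later version: the same check, promoted to proper error handling."""
    if len(values) == 0:
        raise EmptyInputError("cannot average an empty sequence")
    return sum(values) / len(values)
```

Note that Python removes `assert` statements when run with the `-O` option,
which is one more reason for an assertion on external input to end up as
explicit error handling.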
1380 | 
1381 | ### Crash Early
1382 | 
1383 | If you know how not to program defensively in a particular situation, go ahead!
1384 | Otherwise, make your code Crash Early to catch bugs as early as possible: use
1385 | sensible assertions and stress edge cases with tests. See
1386 | [Some notes C in 2016: Code offensively](http://blog.erratasec.com/2016/01/some-notes-c-in-2016.html#.VtGEKBg7T5c)
1387 | and
1388 | [Spotify engineering culture (part 2): "We aim to make mistakes faster than anyone else"](https://labs.spotify.com/2014/09/20/spotify-engineering-culture-part-2/).
1389 | 
1390 | ## Testing
1391 | 
1392 | ### Write Tests, Even Bad Ones
1393 | 
1394 | If you do not write tests, you will never learn how to write them. It's better
1395 | to write bad tests than to write none at all.
1396 | 
1397 | ### TDD as a Toolbox
1398 | 
1399 | The ability to do Test-Driven Development (TDD) is not a binary "can or cannot"
1400 | skill. It's about having a wide range of techniques, patterns, tricks, and hacks
1401 | in your toolbox. When you have enough of them, you can test almost anything in a
1402 | reasonable amount of time.
1403 | 
1404 | ### Legacy Code is Code Without Tests
1405 | 
1406 | As Michael Feathers puts it in Working Effectively with Legacy Code, "Legacy
1407 | code is code without tests."
1408 | 
1409 | ### Testing as a Way to Manage Complexity
1410 | 
1411 | In addition to ensuring quality, testing is essential for simulations that help
1412 | manage complexity. If I can test and simulate every aspect of my program, I can
1413 | effectively manage its complexity. However, if there are blind spots – areas
1414 | that are difficult or impossible to test – I lose control over those areas and
1415 | must rely on real users to test in the wild.
1416 | 
1417 | ### Test It to Engineer It
1418 | 
1419 | "If you can't measure it, then it can't be called engineering" (Ivar Jacobson,
1420 | Object-Oriented Software Engineering: A Use Case Driven Approach). We can
1421 | interpret "measure" as "test", with testing serving as both a form of
1422 | measurement and a core part of engineering.
1423 | 
1424 | ### Improve Testability
1425 | 
1426 | Ideally, everything should be testable. If something is difficult to test, it
1427 | often signals a need to improve code quality, toolset, or testing
1428 | infrastructure. With effort, these can be enhanced. If unsure how to test
1429 | something, start with a simple approach: stub everything, simplify the network,
1430 | assert what's necessary, then iterate on refining both the test and the system
1431 | under test (SUT).
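
A minimal sketch of the "stub everything, simplify the network" starting point
in Python. `VersionClient`, its `VERSION?` command, and `StubTransport` are
hypothetical; the point is only that the network is replaced by an object that
returns a canned reply:

```python
class StubTransport:
    """Stands in for the network: returns a canned reply, records requests."""

    def __init__(self, reply: bytes) -> None:
        self.reply = reply
        self.sent = []

    def request(self, payload: bytes) -> bytes:
        self.sent.append(payload)
        return self.reply


class VersionClient:
    """Hypothetical system under test: asks a device for its version."""

    def __init__(self, transport) -> None:
        # The transport is injected, so a test can pass in a stub.
        self._transport = transport

    def version(self) -> str:
        return self._transport.request(b"VERSION?").decode("ascii").strip()


def test_version_is_parsed():
    stub = StubTransport(reply=b"1.2.3\n")
    client = VersionClient(stub)
    assert client.version() == "1.2.3"
    assert stub.sent == [b"VERSION?"]
```

From here, the test and the SUT can be refined step by step, for example by
replacing the stub with a local fake server once the basic behavior is pinned
down.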
1432 | 
1433 | ## Distribution
1434 | 
1435 | ### Provide Basic Test Sequences with Your Product
1436 | 
1437 | If you are a provider of software or hardware, consider going beyond the
1438 | standard "interface control document" (ICD) by including basic test sequences –
1439 | a "Hello World"-type program that allows users to quickly get started with your
1440 | product. Such examples help users bring the system online and get up to speed
1441 | without unnecessary guesswork.
1442 | 
1443 | The lack of clear "Hello World" or how-to documentation is especially prevalent
1444 | in the embedded software industry, where companies often rely solely on ICDs or
1445 | technical reference manuals. This forces end-user software engineers to engage
1446 | in guesswork and reverse-engineer the documentation to figure out how to bring
1447 | up a device. While the industry is gradually improving in this regard, there is
1448 | still a long way to go. By providing a clear and functional "Hello World"
1449 | example with every product, you empower your users and make adoption of your
1450 | product much smoother.
1451 | 
1452 | ### Provide Drivers Alongside Your Hardware
1453 | 
1454 | If you are a hardware provider, consider supplying software drivers with your
1455 | device rather than just a technical reference manual for end-users to decipher
1456 | and implement. As the developer of the device, you understand its functionality
1457 | better than anyone else. By providing ready-to-use drivers, you save your users
1458 | the time and effort of implementing the device's features themselves.
1459 | 
1460 | With some effort on your part, you can significantly improve the adoption of
1461 | your product by making it easier to integrate and use. A smooth setup process
1462 | not only enhances user satisfaction but also reduces the barriers to bringing
1463 | your hardware to market.
1464 | 
1465 | ### Provide Simulators Alongside Your Hardware
1466 | 
1467 | If you supply hardware, consider providing a software simulator that mimics your
1468 | device. This greatly simplifies integration into users' SIL/PIL/HIL setups,
1469 | especially if the target users have access to only a limited number of your
1470 | devices (such as when the device is very expensive).
1471 | 
1472 | For language choice, default to Python, as it is widely used for embedded
1473 | development tools. If performance is critical, a C/C++/Rust simulator is also a
1474 | great option, as these languages integrate well with embedded environments.
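
As an illustration, a simulator can be as small as a class that answers the
device's commands. The sketch below assumes a hypothetical thermometer with a
text protocol (`READ?`, `SET <value>`); a real simulator would mirror the actual
device interface instead:

```python
class SimulatedThermometer:
    """Software stand-in for a hypothetical thermometer device."""

    def __init__(self, temperature_c: float = 20.0) -> None:
        self.temperature_c = temperature_c

    def handle(self, command: str) -> str:
        """Answer one text command the way the real device is assumed to."""
        if command == "READ?":
            return f"{self.temperature_c:.1f}"
        if command.startswith("SET "):
            # Test-only hook: lets a SIL setup inject a temperature.
            self.temperature_c = float(command[4:])
            return "OK"
        return "ERR unknown command"
```

Because the simulator exposes the same commands as the assumed device, user code
can be developed and tested against it before any hardware is available.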
1475 | 
1476 | ## Documentation
1477 | 
1478 | ### The Illusion of Easy Documentation
1479 | 
1480 | Good documentation is dry and boring. This can create an illusion that writing
1481 | good documentation is easy when in fact it is not.
1482 | 
1483 | ### Less prose, more structure
1484 | 
1485 | Technical documentation is supposed to focus an engineer's attention on
1486 | achieving a given goal, such as building a specific system. It is easier to
1487 | focus on things that have structure embedded in them than on things that are
1488 | hidden in several paragraphs of prose. Prose has no explicit structure, which is
1489 | why a reader has to do the extra exercise of creating order out of what they
1490 | are reading. If the documentation already has an order in it, the reader can
1491 | spend less time mentally reconstructing the content and focus on the technical
1492 | facts more easily.
1493 | 
1494 | Some of the important tools that communicate order in technical documentation:
1495 | 
1496 | - Document structure and table of contents
1497 | - Diagrams
1498 | - Tables.
1499 | 
1500 | ### Too Much Structure Overload
1501 | 
1502 | Excessively deep nesting in documents or folder structures can hinder the
1503 | understanding of the overall project or system structure, especially if the
1504 | principles used for organizing the sections lack consistency. Ideally, a good
1505 | structure should be intuitive, or at the very least, the organizational
1506 | principle should be easy to understand and mentally map, facilitating easier
1507 | navigation of the content.
1508 | 
1509 | ### Encyclopedic Document
1510 | 
1511 | An encyclopedic document is created over time as a collection of inputs from
1512 | various ad hoc events, eventually becoming a generic repository of everything.
1513 | These documents often have complex, nested structures and lack a single
1514 | consistent narrative. Reading them feels more like going through a dictionary
1515 | from A to Z rather than following a coherent story. This can make it difficult
1516 | for readers to stay engaged, which might explain why many people shy away from
1517 | reading standards altogether.
1518 | 
1519 | Standards or guidelines are often structured in this encyclopedic way, as they
1520 | aim to encompass all aspects of product development or organizational processes.
1521 | Similarly, requirements specifications can easily take on an encyclopedic form,
1522 | making them hard to navigate and comprehend.
1523 | 
1524 | When creating such documents, it's important to establish a guiding principle
1525 | that helps readers mentally map and navigate the content. Ideally, the document
1526 | should include a unifying narrative or story that makes it easier to follow,
1527 | even if the underlying information is complex or diverse. A clear structure and
1528 | logical flow can transform an overwhelming collection of information into a
1529 | useful and accessible resource.
1530 | 
1531 | ## Meetings
1532 | 
1533 | ### Sound Check
1534 | 
1535 | It's great when everyone joins a meeting on time, but an often-overlooked
1536 | practice is doing a quick sound and video check to ensure everything is working
1537 | smoothly. A good rule of thumb is to join:
1538 | 
1539 | - 5 minutes early for routine meetings.
1540 | - 15–30+ minutes early for important meetings, to handle any technical issues in
1541 |   advance.
1542 | 
1543 | ### Meeting Agenda
1544 | 
1545 | A well-prepared meeting runs smoothly when attendees know what to expect.
1546 | 
1547 | - A strong meeting has a predefined agenda that allows participants to follow a
1548 |   clear execution plan.
1549 | - Is the agenda known in advance?
1550 | - Can you or your team define it?
1551 | - Are there questions or answers that can be prepared beforehand?
1552 | 
1553 | ### Meeting Notes
1554 | 
1555 | Meetings often lack structure, and when no notes are taken, valuable discussions
1556 | can be lost. A better approach is for someone to take ownership of note-taking
1557 | in real-time, ideally on a shared screen so everyone can see what is being
1558 | recorded.
1559 | 
1560 | - If your team owns the agenda, align meeting notes with the planned topics.
1561 | - Structure notes so key points and next steps are clear.
1562 | 
1563 | ### Capturing Meeting Results
1564 | 
1565 | A meeting without tangible outcomes is just an expensive conversation. At a
1566 | minimum, meetings should result in:
1567 | 
1568 | - Action points: tasks, follow-ups, next meetings.
1569 | - Decisions made.
1570 | - Recognized trade-offs.
1571 | 
1572 | Whenever possible, capturing processes or architectures in a diagram is better
1573 | than a simple bullet point. Even if no formal notes are recorded, every
1574 | participant leaves with takeaways and mental models – but written records
1575 | significantly increase the meeting's effectiveness.
1576 | 
1577 | Anti-pattern: Running meetings without documenting useful outcomes, leading to
1578 | wasted time and repeated discussions.
1579 | 
1580 | ### Briefing In
1581 | 
1582 | Before the actual meeting, getting alignment among participants is key, whether
1583 | for internal team discussions or external events like conferences and large
1584 | review meetings. When a team participates in an external meeting, it is crucial
1585 | that everyone is on the same page and presents a unified front, avoiding any
1586 | visible disagreement or misalignment.
1587 | 
1588 | Good questions to determine if a pre-meeting briefing is needed:
1589 | 
1590 | - How many attendees already know what will be presented?
1591 | - Does the content introduce significant innovation that requires prior context?
1592 |   Could too much new information create confusion within the presenting team?
1593 | 
1594 | Common pitfalls:
1595 | 
1596 | - Discussing internal team matters in the presence of external participants.
1597 | - Asking too many unrelated questions that derail the focus of the meeting,
1598 |   particularly when it disrupts team cohesion and diverts attention from the
1599 |   main agenda. This is especially problematic when an individual undermines the
1600 |   shared position of the team by introducing misalignment.
1601 | 
1602 | ### Sharing Screen & Presenting Material
1603 | 
1604 | - Share only the relevant content – close unrelated applications, especially
1605 |   internal company chats, before presenting to an external audience.
1606 | - If you need to access other files or perform actions outside the presentation,
1607 |   unshare your screen first, complete the task, then reshare only the necessary
1608 |   content.
1609 | - If your team is presenting to an external party, align on the materials
1610 |   beforehand to ensure consistency in messaging.
1611 | 
1612 | ## Systems
1613 | 
1614 | ### Good enough is often best
1615 | 
1616 | "Good enough for each part is often best for the whole system." ("The Art of
1617 | Systems Thinking")
1618 | 
1619 | In "Engineering a Safer World", Nancy Leveson discusses how, in air traffic
1620 | control, individual flight paths may not be optimized for each aircraft to
1621 | ensure overall traffic harmony. This approach is necessary because optimizing
1622 | each flight path individually could lead to conflicts and inefficiencies.
1623 | Instead, air traffic control systems manage traffic by coordinating flight paths
1624 | to maintain safe separation between aircraft, ensuring the overall safety and
1625 | efficiency of the airspace.
1626 | 
1627 | ### Designing Systems for Effective Work
1628 | 
1629 | - "Rather than trying to find extraordinary people to do a job, design the job
1630 |   so that ordinary people can do it well." ("The Art of Systems Thinking")
1631 | 
1632 | > ...No one comes to work to do a bad job, but the structure of the system may
1633 | > make good work impossible. If management falls into the blame trap, they may
1634 | > fire the offending individual and hire someone else - who may do no better.
1635 | > Rather than trying to find extraordinary people to do a job, design the job so
1636 | > that ordinary people can do it well. It is the structure of the system that
1637 | > creates the results. For better results, change the structure of the system.
1638 | 
1639 | ### The Risk of Default Outcomes
1640 | 
1641 | Unresolved trade-offs, especially those that persist over long periods, can be
1642 | risky. Decisions left undecided, such as whether to build or buy critical
1643 | hardware, will not remain open forever. Instead, they tend to resolve themselves
1644 | by default, often in bad ways, whether due to inertia, external pressures, or
1645 | short-term needs. Like a coin that always falls on a side, an undecided
1646 | trade-off will eventually land on an outcome which might not align with
1647 | strategic goals.
1648 | 
1649 | To mitigate this risk, individuals, managers, teams, and organizations should
1650 | proactively track and resolve open decisions, ensuring that critical choices are
1651 | made deliberately rather than by default. Tools such as an Open Questions Log or
1652 | a Risk Registry can support the structured resolution of such trade-offs.
1653 | 
1654 | ## People and Organizations
1655 | 
1656 | ### Everyone is busy
1657 | 
1658 | Everyone is busy, including you. The development of software products often
1659 | takes place in rushed environments, where everyone is focused on achieving
1660 | specific goals without having time to do things properly or fully explore all
1661 | the options for what is being built.
1662 | 
1663 | How about QA? A company may have a dedicated QA department, or even Safety &
1664 | Reliability teams in addition. They are most likely also busy, focusing on the
1665 | most critical tasks to the point that they probably don't have enough time to
1666 | interact with development teams, understand the real requirements, or provide
1667 | 100% coverage and a complete assessment of the project scope.
1668 | 
1669 | Is it a problem that everyone is busy? Given its ubiquity, it doesn't seem so.
1670 | Some people even seem to thrive on being busy all the time. Organizations appear
1671 | to care little about "busyness" itself. What really matters is whether the busy
1672 | person or department can deliver results according to the schedule or whether
1673 | something left uncovered by the busy teams could create serious problems for the
1674 | business.
1675 | 
1676 | One unfortunate observation is that it usually takes significant time before the
1677 | uncovered issues are revealed and addressed from the top down. During this
1678 | incubation period, enough money is often lost, a number of unhappy customers
1679 | accumulate, and other losses may occur, depending on the type of project.
1680 | 
1681 | Or, busy people themselves get tired... and create new methods and tools.
1682 | Sometimes, a new tool can eliminate much of the effort required to achieve a
1683 | goal, or it simply allows a busy person to focus on "what is most important"
1684 | rather than covering everything.
1685 | 
1686 | ### Solving Problems with Cash
1687 | 
1688 | Every engineering problem can be solved with an infinite amount of cash.
1689 | 
1690 | ### The Paradox of Rushing in Software/Systems Engineering
1691 | 
1692 | Attempting to accelerate development often leads to greater delays. In highly
1693 | complex systems, skipping thorough validation, testing, or review processes can
1694 | result in unforeseen issues, requiring extensive rework and ultimately
1695 | prolonging the timeline beyond what a steady, methodical approach would have
1696 | taken.
1697 | 
1698 | One [parable](https://howtopracticezen.org/Advanced%20Zen/) illustrates this
1699 | well:
1700 | 
1701 | > Zen teachers often tell the story of a young monk who asked a Zen master:
1702 | >
1703 | > "How long will it take me to attain enlightenment?" The master thought for a
1704 | > few moments and replied: "About ten years." The young monk was upset and said:
1705 | > "But you are assuming I am like the other monks, and I am not. I will practice
1706 | > with great determination." "In that case", replied the Master, "twenty years."
1707 | 
1708 | and a [similar one](https://martialarts.stackexchange.com/a/7133/7133):
1709 | 
1710 | > ... "But if I work hard, how many years will it take to become a master?"
1711 | > persisted the youth.
1712 | >
1713 | > "Oh, maybe thirty years", said Banzo.
1714 | >
1715 | > "Why is that?" asked Matajuro. "First you say ten and now thirty years. I will
1716 | > undergo any hardship to master this art in the shortest time!"
1717 | >
1718 | > "Well", said Banzo, "in that case you will have to remain with me for seventy
1719 | > years. A man in such a hurry as you are to get results seldom learns quickly."
1720 | 
1721 | ### Four seasons
1722 | 
1723 | It is an amusing analogy: just as a year starts with spring and ends with
1724 | winter, a similar lifecycle can be observed in the growth of organizations.
1725 | 
1726 | Spring is a young company, a handful of people. Not much structure, no strict
1727 | policies, a startup atmosphere. Not yet a fixed income, but probably investments
1728 | or lack of them. More full-stack people with broad expertise. Spring is like a
1729 | village. Colleagues are fellow villagers.
1730 | 
1731 | A Summer is a Spring that made it, a company that is flourishing. Exponential
1732 | growth, more people are hired, extremely steep curve of everything: the
1733 | development of the company structure, more departments, more specialization. The
1734 | philosophy of the company is no longer about "finding its way" but rather
1735 | accelerating on what made a transition from Spring to Summer possible.
1736 | 
1737 | Autumn is already a company with a legacy. The source of income is known and
1738 | stabilized. The responsibilities are defined. Few or no people are busy
1739 | defining the product anymore; more people are busy with optimization:
1740 | improving the product, doing sales, and increasing revenue.
1741 | 
1742 | Winter is a dangerous phase. The company has been making profit and doing its
1743 | best by exhausting what was known to work well. At this point, the structure of
1744 | the company is the most fixed and therefore the least resilient. The company may
1745 | cease to exist because there are younger and more adequate competitors or it can
1746 | find a way to renew itself and make it into a new year.
1747 | 
1748 | Another interesting observation is that a transition from season to season
1749 | almost never goes smoothly – in order to accommodate change, the company has
1750 | to adapt and this very often happens with a good deal of destruction and
1751 | restructuring (see Prima Materia heuristic). Dropping what does not work and
1752 | keeping or creating what does might be crucial for such a transition. Not all of
1753 | the Spring companies make it into Summer. Not all of the companies end up being
1754 | Winter. Not all of the companies can survive their deep Winter.
1755 | 
1756 | One particular management mistake is trying to apply the best practices of
1757 | season A to season B when it is too early or already too late for them.
1758 | Example: imposing a strict top-down management style on a company of 5-10
1759 | people working in a flat hierarchy and making them adhere to reporting lines
1760 | can be just as inadequate as expecting a fully flat hierarchy to work in an
1761 | Autumn-like business.
1762 | 
1763 | Not only can we match seasons and companies, we can also match seasons and
1764 | personalities:
1765 | 
1766 | - Autumn is too boring for Spring people who value creativity and individual
1767 |   contribution over hierarchies and defined processes.
1768 | - For Autumn people, Spring is too chaotic and unstructured. Working for a
1769 |   Spring company is inherently unsafe: the younger the company, the fewer
1770 |   guarantees it can provide to its employees.
1771 | - It may not be optimal for a company to have too many people who represent an
1772 |   incompatible season. It can be damaging for a person to get stuck working at a
1773 |   company that does not match their season type. In such cases, a person who
1774 |   found a matching season can be compared to a fish that found its water.
1775 | 
1776 | See also Kent Beck's
1777 | [The Product Development Triathlon](https://medium.com/@kentbeck_7670/the-product-development-triathlon-6464e2763c46).
1778 | His 3 phases: Explore-Expand-Extract can be loosely mapped to the
1779 | Spring-Summer-Autumn seasons.
1780 | 
1781 | ## Standards
1782 | 
1783 | ### Idealized standards vs. practical implementation
1784 | 
1785 | Standards provide an idealized or encyclopedic view of how systems should
1786 | function and how products should be developed. Frequently, a standard represents
1787 | the combined inputs of multiple companies, making it more extensive than what
1788 | any single company might realistically implement. For most companies,
1789 | implementing a standard is a "best effort" exercise.
1790 | 
1791 | Some standards are practical only for larger companies and can be
1792 | counterproductive or harmful for smaller organizations attempting to implement
1793 | them. Recognizing this, some standards explicitly account for a company's
1794 | maturity level and offer recommendations on which parts to implement at
1795 | different stages of development.
1796 | 
1797 | ### The challenge of standards implementation
1798 | 
1799 | Implementing standards and managing their results within an organization can be
1800 | difficult and complex. However, without any standards, everything becomes 10 to
1801 | 100 times harder and more chaotic.
1802 | 
1803 | ### Standards and best practices
1804 | 
1805 | Standards seek out best practices, collect them, and generalize them.
1806 | 
1807 | ### Standards favor good practice
1808 | 
1809 | Standards favor good practices. If a company has adopted a practice that is not
1810 | yet conventional but makes sense and adds value, it is unlikely that this
1811 | practice would be rejected or deemed inappropriate by any standard.
1812 | 
1813 | ### Wrong is worse than early or incomplete
1814 | 
1815 | Sometimes it is worse to be wrong than to be early or lack information. The
1816 | context: passing the project review milestones required by standards.
1817 | 
1818 | ## Requirements
1819 | 
1820 | ### One-stop shopping
1821 | 
1822 | > "One-stop shopping" is a useful requirements writing principle. Simply, people
1823 | > reading the requirements should be able to get all the information they need
1824 | > from one document or from one section of a document. They should not have to
1825 | > jump between different sections to understand the requirement. (Patterns for
1826 | > Effective Use Cases by Steve Adolph et al., Chapter 7.1)
1827 | 
1828 | ## Safety
1829 | 
1830 | ### Safety does not exist without blood, loss or failure
1831 | 
1832 | Safety is not there from the very beginning. A gloomy poet could say that safety
1833 | blooms on blood. Safety also does not exist on its own: you first need to build
1834 | something that kills people or causes a loss, then some people will bother to
1835 | learn from this and take action. Only then does safety get recognized and truly
1836 | appreciated.
1837 | 
1838 | Consequence: safety resonates most with people who have some experience of
1839 | dealing with blood, loss or failure.
1840 | 
1841 | ### Safety is boring
1842 | 
1843 | When implemented well enough, safety becomes boring. Everything is working, no
1844 | one complains. At that moment, it is easier than ever to forget why safety is
1845 | there in the first place. Example: how often do we bother to look at the safety
1846 | manuals? Does it mean that safety is there?
1847 | 
1848 | ### Safety is very hard to achieve but is very easy to lose
1849 | 
1850 | Safety is an extremely fragile and sensitive property of systems. So much
1851 | effort is put into achieving it, and still it is so easy to bring the whole
1852 | system down. Some of the most common reasons for failure are:
1853 | 
1854 | - degradation of existing components
1855 | - changes to the system that do not take the current system's behavior into
1856 |   account
1857 | - new unexpected factors coming from outside the system boundary
1858 | 
1859 | Consequence: safety requires continuous and intelligent effort.
1860 | 
1861 | ### Success breeds failure
1862 | 
1863 | Handbook of Walkthroughs, Inspections, and Technical Reviews, p.412:
1864 | 
1865 | > ... however, we have to anticipate that we will in fact succeed once in a
1866 | > while - and we must also anticipate what that success will bring. For
1867 | > instance, one error-riddled system was seldom used by its several hundred
1868 | > potential users, so management decided to mount an effort to have the system
1869 | > repaired in a systematic fashion. The resulting system was so dependable and
1870 | > useful that usage suddenly increased by a factor of a thousand over previous
1871 | > usage. This increase in transaction volume made the file design of the system
1872 | > completely inadequate to the daily load - which soon meant that nobody could
1873 | > get results fast enough to be useful. The entire problem - and so many others
1874 | > like it - could have been avoided if the review group had only considered that
1875 | > unavoidable law of nature: **Success breeds failure**. So, ..., be prepared
1876 | > for the inevitable reaction. If you start making systems better, your users
1877 | > will want more of the same - the best side effect of all.
1878 | 
1879 | ### Safety as a Defensive Discipline
1880 | 
1881 | Safety is often seen as a defensive discipline, in contrast to fields focused on
1882 | creation, innovation, and action, which drive progress. While these fields push
1883 | forward with new ideas and developments, safety functions as a secondary,
1884 | backing force. Its role is to prevent harm, minimize risks, and ensure that
1885 | these actions happen within a secure framework. Safety doesn't seek to lead the
1886 | charge but to protect and enable other processes to unfold without catastrophic
1887 | failure.
1888 | 
1889 | However, the drive to "lead the charge" often means safety is ignored or
1890 | sidelined until it's too late. In this way, safety acts like a belt that holds
1891 | uncontrolled progress together, preventing it from falling apart when the
1892 | inevitable risks are not properly addressed.
1893 | 
1894 | ### Safety for Engineering is Like Medicine for People
1895 | 
1896 | Medicine isn't the most exciting thing, and no one wants to spend all their time
1897 | thinking about it. But it's clear that humanity can't thrive without it, even
1898 | with all the amazing achievements of civilization.
1899 | 
1900 | In the same way, organizations focus on building things that work and often
1901 | don't think much about safety or quality as long as things are fine and
1902 | customers are happy. But over time, they may realize that the "health" of their
1903 | products, teams, and development processes also matters.
1904 | 
1905 | How safety and quality are handled depends a lot on experience and knowledge.
1906 | Not long ago, amputation was seen as the best way to treat many illnesses. This
1907 | shows how much we've learned and how practices improve over time. Engineering
1908 | also needs to grow in this way, moving beyond quick fixes to create stronger,
1909 | longer-lasting solutions.
1910 | 
1911 | ### User Interfaces and Critical Systems
1912 | 
1913 | Too much simplicity can be a problem. Overly simplistic interfaces may prevent
1914 | operators from engaging their brains fully, which could negatively impact their
1915 | performance in critical situations. If an interface is too simple, operators can
1916 | fall into automatism, executing the wrong action due to a lack of alertness.
1917 | Software and interface designers should therefore prioritize preventing user
1918 | mistakes rather than focusing solely on aesthetics.
1919 | 
1920 | ## Books
1921 | 
1922 | - [The Art of Systems Thinking](https://www.google.de/search?q=the+art+of+systems+thinking+book&oq=the+art+of+systems+thinking+book)
1923 | 
1924 | ## Similar resources
1925 | 
1926 | - [Kent Beck - Mastering Programming](https://www.facebook.com/notes/kent-beck/mastering-programming/1184427814923414/)
1927 | - [Heuristics of Software Testability](http://www.satisfice.com/tools/testable.pdf)
1928 | - [The Law of Leaky Abstractions](https://www.joelonsoftware.com/2002/11/11/the-law-of-leaky-abstractions/)
1929 | - [Lessons Learned in Software Development](https://henrikwarne.com/2015/04/16/lessons-learned-in-software-development/)
1930 | 
1931 | ## Copyright
1932 | 
1933 | Copyright (c) 2015-2025 Stanislav Pankevich s.pankevich@gmail.com.
1934 | 


--------------------------------------------------------------------------------
/tasks.py:
--------------------------------------------------------------------------------
 1 | # Invoke is broken on Python 3.11
 2 | # https://github.com/pyinvoke/invoke/issues/833#issuecomment-1293148106
 3 | import inspect
 4 | import os
 5 | import re
 6 | import sys
 7 | from typing import Optional
 8 | 
 9 | if not hasattr(inspect, "getargspec"):
10 |     inspect.getargspec = inspect.getfullargspec
11 | 
12 | import invoke  # pylint: disable=wrong-import-position
13 | from invoke import task  # pylint: disable=wrong-import-position
14 | 
15 | # Specifying encoding because Windows crashes otherwise when running Invoke
16 | # tasks below:
17 | # UnicodeEncodeError: 'charmap' codec can't encode character '\ufffd'
18 | # in position 16: character maps to <undefined>
19 | # People say, it might also be possible to export PYTHONIOENCODING=utf8 but this
20 | # seems to work.
21 | # FIXME: If you are a Windows user and expert, please advise on how to do this
22 | # properly.
23 | sys.stdout = open(  # pylint: disable=consider-using-with
24 |     1, "w", encoding="utf-8", closefd=False, buffering=1
25 | )
26 | 
27 | 
28 | def run_invoke(
29 |     context,
30 |     cmd,
31 |     environment: Optional[dict] = None,
32 |     warn: bool = False,
33 | ) -> invoke.runners.Result:
34 |     def one_line_command(string):
35 |         return re.sub("\\s+", " ", string).strip()
36 | 
37 |     return context.run(
38 |         one_line_command(cmd),
39 |         env=environment,
40 |         hide=False,
41 |         warn=warn,
42 |         pty=False,
43 |         echo=True,
44 |     )
45 | 
46 | 
47 | @task(default=True)
48 | def list_tasks(context):
 49 |     list_command = """
 50 |         invoke --list
 51 |     """
 52 |     run_invoke(context, list_command)
53 | 
54 | 
55 | @task
56 | def toc(context):
57 |     run_invoke(context, "doctoc README.md")
58 | 
59 | 
60 | @task
61 | def format(context):
62 |     run_invoke(context, "prettier --write --print-width 80 --prose-wrap always README.md")
63 | 
64 | 
65 | @task(aliases=["l"])
66 | def lint(context):
67 |     format(context)
68 |     toc(context)
69 | 


--------------------------------------------------------------------------------