Architectural Trade-Offs: the Art of Minimizing Unhappiness

Key Takeaways

  • To architect is to be a frustrated perfectionist; a good architecture minimizes this unhappiness by making trade-offs that can be lived with.
  • The main skill in architecting is making trade-offs. These trade-offs reflect the most important and difficult decisions a team will make about its architecture.
  • The impact of architectural trade-off decisions can only be evaluated by building something and testing it, usually in the real world. 
  • Being able to generate reasonable alternatives is important, and much of this comes from experience working on similar problems in similar contexts. This is where experience matters.
  • Getting good at forming hypotheses and running low-cost experiments to evaluate trade-off decisions helps teams make better trade-off decisions. 
     

Software architecture, like life, consists of a series of trade-off decisions made with incomplete information and often under tremendous time pressure. Teams seeking a perfect software architecture are going to be unhappy, yet the alternatives to an imperfect architecture are worse: brittle, expensive systems that can’t evolve and, eventually, can’t be maintained. Software architectures are driven by Quality Attribute Requirements (QARs), most of which are unknown at the time architectural decisions are made. Even when they are known, it is nearly impossible to satisfy all of an architecture’s QARs, which leads teams to make trade-offs.

Trade-offs make teams feel bad, like they have failed somehow, but they are a necessary, and even essential, part of getting a solution to market. Software architectures (especially enterprise architectures) are often based on a set of principles and standards that teams want to follow as rigorously as possible. Making compromises that do not follow these principles and standards to deliver a workable system often disappoints team members.

The art of making trade-offs is, in many respects, a matter of minimizing the team’s unhappiness with the architecture. No architecture is perfect, but an architecture can be "good enough". What constitutes "good enough" is worth exploring in more detail.

An analysis of the architectural trade-offs can provide a lot of information about the architecture itself and especially its risks, as demonstrated by the Architecture Tradeoff Analysis Method (ATAM) from CMU/SEI.

Architecting involves finding solutions to unknown problems

As we have noted in other articles, architecting a software system evolves from attempts to make the system satisfy its Quality Attribute Requirements (QARs). What makes this especially challenging is that many of a system’s QARs are unknown, at least initially, and some of the ones that are believed to be known turn out to be wrong. In addition, QARs can be deeply interrelated, as in the case of scalability and performance, so that a complete solution to all QARs is often impossible, hence the need to make trade-offs when making decisions.

QARs are unknown largely because, when a system is being envisioned, no one knows how successful it will be. For example, if the system isn’t widely adopted, there is no need to make it scale (that is, support large numbers of users), and it sometimes takes time to understand the real need.

QARs turn out to be wrong when they rest on assumptions that prove false, for example when a business sponsor says that the system only needs to support thousands of concurrent users when, in reality, the actual need is ten times this. The reverse also happens, when the system never achieves the popularity that would demand high levels of concurrency.

The main skill in architecting is making trade-offs

One might think one of the most important architecting skills is to accurately forecast QARs, given the high cost of getting them wrong. Being able to foresee the future is always useful, although elusive. But even if teams have perfect QARs, they are still faced with having to make trade-offs that result from being unable to completely satisfy competing QARs - such as cost and capacity, or performance and scalability, or time-to-market and just about everything else.

The critical skill in making trade-offs is being able to consider two or more potentially opposing alternatives at the same time. This requires being able to clearly convey alternatives so a team can decide which alternative, if any, acceptably meets the QARs under consideration.

What makes trade-off decisions particularly difficult is that the choice is not clear; the facts supporting the pro and con arguments are typically only partial and often inconclusive. If the choice were clear, there would be no need to make a trade-off decision.

A team may need to make a decision between synchronous and asynchronous communication between services, which requires making trade-offs between performance and scalability. A performance QAR could be easier to define than a scalability QAR, especially when a new product is first being rolled out. The requirement for the system’s performance may be clear - for example, the time to process a specific transaction should not exceed xx seconds with nn concurrent users. However, the scalability requirement may be undefined at that time, as it will depend on how successful the product will be. The team may decide to use asynchronous communication between some key services to ensure future scalability but this could negatively impact performance. On the other hand, too much reliance on synchronous communications could negatively affect scalability and create technical debt, should the new product turn out to be very popular.
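The structural difference between the two communication styles can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the names charge, place_order_sync, place_order_async, and drain are hypothetical, and an in-memory deque stands in for whatever message broker a real system would use.

```python
from collections import deque


def charge(order_id):
    """Stand-in for a downstream service call (hypothetical)."""
    return f"charged:{order_id}"


def place_order_sync(order_id):
    """Synchronous style: the caller waits for the downstream result."""
    return charge(order_id)


def place_order_async(order_id, outbox):
    """Asynchronous style: enqueue the work and acknowledge immediately."""
    outbox.append(order_id)
    return f"accepted:{order_id}"


def drain(outbox):
    """A worker processes queued orders later, at its own pace."""
    results = []
    while outbox:
        results.append(charge(outbox.popleft()))
    return results


if __name__ == "__main__":
    outbox = deque()
    print(place_order_sync(1))           # result is available right away
    print(place_order_async(2, outbox))  # only an acknowledgment so far
    print(drain(outbox))                 # the actual result arrives later
```

In the synchronous version the caller has its answer as soon as the call returns, but it is blocked for as long as the downstream service takes. In the asynchronous version the caller is released immediately and the queue can absorb bursts, at the cost of a longer and less predictable time until the transaction is actually complete.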

What often separates architectural decisions from other design decisions is that you can never be sure the architectural decisions are right. Each decision is really just an educated guess, a hypothesis, that must be further explored by building something, releasing it, and gathering feedback. Sometimes, even then, the trade-off remains: it is sometimes impossible to completely satisfy a set of QARs. All you can know is that the decisions you have made are currently adequate.

Making decisions that seem adequate now can lead to unintended consequences over time. Take quantum computing, where vendors decided to focus on usability by hiding complex, critical components of the software stack. As described by Vlad Stirbu, Arianne Meijer–van de Griend and Jake Muff in their ICSA 2024 conference paper called "Exposing the hidden layers and interplay in the quantum software stack", this decision resulted in performance limitations in current qubit implementations.

Releasing is the only way to evaluate trade-offs

No amount of pure analysis is sufficient to evaluate trade-off decisions; real-world feedback is the only way to tell if the trade-off is acceptable (see Figure 1).

Figure 1: Architectural Trade-offs and Releases Feedback Cycle

A simple example illustrates the point: local caching of data can improve response time by eliminating remote data access over a network, but it can also reduce concurrency if caches become out of date, and it can reduce performance if the local caches have to be frequently refreshed. Both of these can vary a lot in a dynamic system. How much to cache and how often to synchronize cache data depends on how often the data is accessed and how often it changes, both of which depend on real-world usage. The only way to evaluate this kind of trade-off is to build and release the part of the system that implements the trade-off and then look at how the system responds to various kinds of load.
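A toy simulation can make the two competing costs visible, although it is no substitute for measurements from a released system. The sketch below is an assumption-laden illustration: TtlCache and simulate are hypothetical names, time is a logical tick counter, a single key is read once per tick, and the remote value changes on a fixed schedule that a real workload would not follow.

```python
from dataclasses import dataclass, field


@dataclass
class TtlCache:
    """Local cache whose entries expire after ttl ticks of a logical clock."""
    ttl: int
    entries: dict = field(default_factory=dict)  # key -> (value, tick stored)

    def get(self, key, now):
        hit = self.entries.get(key)
        if hit is not None and now - hit[1] < self.ttl:
            return hit[0]
        return None  # missing or expired

    def put(self, key, value, now):
        self.entries[key] = (value, now)


def simulate(ttl, ticks, write_every):
    """Read one key every tick; the remote value changes every write_every ticks.

    Returns (remote_reads, stale_reads) so the two costs can be compared.
    """
    cache = TtlCache(ttl)
    remote_value = 0
    remote_reads = 0
    stale_reads = 0
    for now in range(ticks):
        if now > 0 and now % write_every == 0:
            remote_value += 1      # the source of truth changes
        value = cache.get("k", now)
        if value is None:
            remote_reads += 1      # cache miss: pay the network cost
            value = remote_value
            cache.put("k", value, now)
        if value != remote_value:
            stale_reads += 1       # served out-of-date data
    return remote_reads, stale_reads


if __name__ == "__main__":
    for ttl in (1, 20):
        print(ttl, simulate(ttl, ticks=100, write_every=10))
```

With a very short expiry every read goes to the remote source and nothing is stale; with a long expiry remote reads drop sharply but a large share of reads return out-of-date data. Where the acceptable balance lies depends on access and change rates that only real usage reveals.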

Releasing an MVP (minimum-viable product) always involves business trade-offs as well as technical trade-offs. The business trade-offs involve guesses about whether users will find the release appealing and compelling. Meeting tight timelines always involves leaving something out in the hope that what is released is valuable. The organization may be unsure that there is even a market for a particular product. Until they are sure that there is demand for the product, they will only want to build a very limited MVP.

In most cases, releasing an MVP also creates a set of architectural trade-offs, which usually involve guesses about how much architecture is "enough." But if the product is successful, the team may have to address scalability and performance issues very rapidly. And even for an MVP, there are some trade-offs which are not negotiable, such as minimum security requirements.

Since the architecture for a system consists of a set of decisions and almost all of these decisions reflect trade-offs, the MVA (minimum-viable architecture) for a release is mostly a set of trade-off decisions. These decisions almost always generate a lot of technical debt. Releasing a system means making trade-off decisions; each release consists of a set of compromises, each of which creates technical debt. If the compromise works out, the technical debt may not need to be resolved, but if and when the compromise becomes unacceptable you’ll need to resolve (repay) the technical debt by rewriting some portion of the system to try a different trade-off.

Reducing technical debt itself is inherently a challenging trade-off and one that many organizations fail to successfully manage. They are under constant pressure to increase the functionality in each successive, incremental MVP, but in doing so they also usually compound the technical debt related to the associated MVAs.

You can’t evaluate trade-offs in technology you don’t understand

Any sufficiently advanced technology is indistinguishable from magic.

― Arthur C. Clarke

Teams who are inexperienced in specific technologies will struggle to make decisions about how to best use those technologies. For example, a team may decide to use a poorer-fit technology such as a relational database to store a set of maps because they don’t understand the better-fit technology, such as a graph database, well enough to use it. Or they may be unwilling to take the hit in productivity for a few releases to get better at using a graph database. In this instance, they are trading the time it would take them to get familiar with a graph database against the technology fitness of a relational database, which may impact scalability, performance, and maintainability, so they are trading time against these QARs.

The solution is not to avoid potentially beneficial technologies but to use incremental releases to learn more about how the technologies work and how to work around their limitations, if possible. Framed as experiments, trying out new technologies when the risk is low can pay off later when the team finds a problem that the technology uniquely solves.

Your trade-off decision only has to be good enough

Sometimes teams struggle to make trade-offs because they know that none of the decisions are perfect. Because a decision is made at a point in time, the good news is that any trade-off only has to be good enough right now. The bad news is that if your system is successful your trade-off decisions probably won’t be good enough for long. The reality is that every decision is compromised and will need to change at some point in the future, but the only way you will ever learn more is to make decisions and then build and release something. You WILL get feedback, and you probably WILL have to respond and adapt. Most developers understand this.

Unfortunately, not all managers have learned this yet; they want to hear that decisions are final and will never change. As a result, the other essential architecting skill is being able to explain the rationale behind trade-offs to managers who can’t (or don’t want to) understand the technical details. This is because of an inherent truth about decisions: they are easier the less you know about the problem. Easier, but usually wrong.

For example, consider the case of localized versus distributed processing. A team may initially choose to design a few large components and run them in the same cloud server to simplify development and deployment and make it easier to get their first release to customers. They suspect this won’t scale well, but they don’t need to scale in the first release; they need to know whether the system is attractive to its potential user community. Later, to deal with scaling issues, they would refactor their architecture into numerous smaller services and distribute them across several containers to leverage elastic scalability.

How to get better at making trade-off decisions

Unfortunately, good judgment comes from experience, most of which comes from bad judgment. There are, however, some ways to improve the learning process:

  • Get better at generating reasonable alternatives. While much of this comes from experience working on similar problems in similar contexts, studying the trade-offs others have made, along with their results, can help inform your choice of alternatives. This understanding inspires adaptations that fit your context.
  • Don’t expect technology to make decisions for you. While generative AI can’t make trade-off decisions for you, it can provide inspiration for alternatives. Generative AI is, at present, a masterful copyist and pattern-matcher, but there is no "intelligence" in its artificiality. As Grady Booch points out, Large Language Models (LLMs), which are the foundation of generative AI, "do little more than interpolate across a multidimensional space". LLMs may know about trade-offs other teams and organizations have made, but they do not know your context or constraints. At best they can provide suggestions for alternatives, but the choice is up to you.
  • Get good at forming hypotheses and running low-cost experiments that yield results quickly. Being able to quickly evaluate and reject alternatives is important, but you have to do it based on data and not hunches. Being able to devise creative solutions is always a valuable skill, but you also need to find ways to test your ideas quickly and cheaply, so as not to overinvest in an idea that doesn’t yield the results you are seeking.
  • Be willing to admit you were wrong and pivot as quickly as possible. This usually means being good at handling rework and undoing part of the existing solution. Modular solutions help. A "CYA mentality" is fatal to this, but unfortunately, many organizations consider admission of a mistake a failure, and in senior positions, it can be career-ending.
  • Understand the cost of rolling back decisions that later prove unworkable. No matter how hard you try, no matter how good your experiments, you will encounter information at some point that forces you to reverse or rework some prior decisions. Having an understanding of the magnitude of this potential rework at the time you make the trade-off might cause you to make different choices.
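As one way to make "data and not hunches" concrete, a trade-off decision can be written down as a hypothesis with an explicit acceptance threshold and then checked against measurements. The sketch below is a minimal illustration, not a prescribed method: the Hypothesis class, the evaluate function, and the choice of 95th-percentile latency as the metric are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass
class Hypothesis:
    """A trade-off decision phrased as a testable claim."""
    claim: str
    max_p95_ms: float  # hypothetical acceptance threshold


def p95(samples):
    """95th percentile of measured latencies (needs at least two samples)."""
    # quantiles(..., n=20) returns 19 cut points; the last one is the 95th percentile.
    return quantiles(samples, n=20)[-1]


def evaluate(hypothesis, samples_ms):
    """Return (supported, observed p95) based on measurements."""
    observed = p95(samples_ms)
    return observed <= hypothesis.max_p95_ms, observed


if __name__ == "__main__":
    h = Hypothesis("Async order intake keeps p95 latency acceptable", max_p95_ms=100.0)
    measurements = list(range(1, 101))  # placeholder data, not real results
    print(evaluate(h, measurements))
```

Stating the threshold before running the experiment makes it harder to rationalize a disappointing result afterwards, and keeping the check this small keeps the cost of rejecting an alternative low.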

Conclusion

Perfect architectures don’t exist because the QARs that drive their evolution are partially unknown and sometimes inconsistent. As a result, trade-off decisions are inevitable and essential. These trade-offs are never perfect, but they can be good enough for the system to survive and thrive. The main skill teams need to make these trade-offs is to be able to balance the relative benefits and drawbacks of different alternatives to make a choice that at least partially meets the architectural goals (QARs) of the system.

The impact of architectural trade-off decisions can’t be fully evaluated through reviews or discussions; it can only be assessed by building something and testing it, usually under real-life conditions, with real users. Doing so is much more likely to generate unexpected events that will more thoroughly exercise the architecture of the system.

Because teams work under extreme time and cost pressures, being able to quickly generate reasonable alternatives is important to help them make decisions more quickly. This skill usually comes from experience working on similar problems in similar contexts. This is where experience with a variety of approaches pays off. The faster they can create testable alternatives, the faster they can make informed decisions about their alternatives.

Teams who do these things well end up with better architectures. They may still have doubts about some of their trade-offs, but those trade-offs are at least supported by the best evidence available at the time, and the teams are better prepared for new information in the future that may force them to reconsider. The teams may not be happy with all of their trade-offs, but their users are happier than they would be otherwise.
