Transcript
Richardson: Welcome to my talk on dark energy, dark matter, and the microservice architecture. I'm sure many of you watching this presentation are architects. When you're not fighting fires, or developing features, your job is to define and evolve your application's architecture. An application architecture is a set of structures, elements, and relations that satisfy its non-functional requirements. These consist of development time requirements such as testability and deployability, and runtime requirements such as scalability and availability. These days, one key decision that you must make is deciding between the microservice architecture and the monolithic architecture. Naturally, you ask Twitter. As you might expect, there are lots of opinions, some more helpful than others. In reality, the answer to this question is that it depends, but on what? What are the criteria that you should consider when selecting an architectural style? In this presentation, I describe how the choice of architecture actually depends upon dark energy and dark matter. These are, of course, concepts from astrophysics, but I've discovered that they are excellent metaphors for the forces or the concerns that you must resolve when defining an architecture, both when deciding between a monolithic architecture and microservices, and when designing a microservice architecture.
Background & Outline
I'm Chris Richardson. I've done a number of things over the past 40 years. For example, I developed Lisp systems in the late '80s, early '90s. I also created the original Cloud Foundry back in 2008. Since 2012, I've been focused on what eventually became known as the microservice architecture. These days, I help organizations all around the world use microservices more effectively. First, I'm going to explain why you should use patterns rather than Twitter to make technology decisions. After that, you will learn about dark energy and dark matter, which are the metaphors for forces or concerns that you must resolve when making architectural decisions. Finally, I'm going to show how to use dark energy and dark matter forces when designing an architecture.
Suck/Rock Dichotomy vs. Patterns
The software development community is divided by what Neal Ford calls the suck/rock dichotomy. In other words, your favorite technology sucks, my favorite technology rocks. Much of the monolith versus microservices argument is driven by this mindset. Patterns are a powerful antidote to the suck/rock dichotomy. They provide a valuable framework for making architectural decisions. A pattern is a reusable solution to a problem occurring in a context, together with its consequences. It's a relatively ancient idea. Patterns were first described in the '70s by the real-world architect, Christopher Alexander. They were then popularized in the software community by the Gang of Four book in the mid-'90s. Christopher Alexander was surprised to find himself invited to speak at software conferences when that happened. What makes patterns especially valuable is their structure. In particular, a pattern has consequences. The pattern forces you to consider both the drawbacks and the benefits. It requires you to consider the issues, which are the subproblems that are created by applying this pattern. A pattern typically references successor patterns that solve those subproblems. Then, finally, a pattern must also reference alternative patterns, which are different ways of solving the same problem.
Later on, I will describe some specific patterns. Patterns that are related through the predecessor/successor relationship, and the alternative relationship often form a pattern language. A pattern language is a collection of patterns that solve problems in a given domain. Eight years ago now, I created the Microservices Pattern Language with the goal of helping architects use microservices more appropriately and effectively. On the left are the monolithic architecture and microservice architecture patterns. They are alternative architectures for your application. Every other pattern in the pattern language is the direct or indirect successor of the microservices architecture pattern. They solve the problems that you create for yourself by deciding to use microservices. The pattern language can be your guide when defining an architecture. The way you use it to solve a problem in a given context is as follows. First, you find the applicable patterns. Next, you assess the tradeoffs of each of those patterns. You then select a pattern and apply it. This pattern updates the context and then usually creates one or more subproblems. You repeat this process recursively until you've designed your architecture.
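The recursive selection loop described above can be sketched in a few lines of Python. This is only an illustration: the `Pattern` class, the numeric `score` (a stand-in for a real tradeoff assessment), and the four-entry catalog are hypothetical and are not part of the Microservices Pattern Language itself. The scores are arbitrary; in practice the assessment depends on your context.

```python
from dataclasses import dataclass, field


@dataclass
class Pattern:
    name: str
    solves: str            # the problem this pattern addresses
    score: int             # hypothetical stand-in for an assessed tradeoff
    creates: list = field(default_factory=list)  # subproblems created by applying it


# Hypothetical catalog: patterns that solve the same problem are alternatives.
CATALOG = [
    Pattern("Monolithic architecture", "application architecture", score=1),
    Pattern("Microservice architecture", "application architecture", score=2,
            creates=["distributed commands"]),
    Pattern("Saga", "distributed commands", score=2),
    Pattern("Two-phase commit", "distributed commands", score=1),
]


def design(problem, catalog=CATALOG):
    """Recursively pick a pattern for a problem and for each subproblem it creates."""
    # 1. Find the applicable patterns.
    candidates = [p for p in catalog if p.solves == problem]
    if not candidates:
        return []
    # 2-3. Assess the tradeoffs (here: compare scores) and select a pattern.
    chosen = max(candidates, key=lambda p: p.score)
    decisions = [(problem, chosen.name)]
    # 4. Applying the pattern creates subproblems; repeat for each of them.
    for subproblem in chosen.creates:
        decisions += design(subproblem, catalog)
    return decisions


decisions = design("application architecture")
```

The result is a list of (problem, chosen pattern) pairs, one per decision made on the way down the successor chain.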
Monolithic and Microservice Architecture Pattern
I now want to describe the first two patterns, the monolithic architecture pattern, and the microservice architecture pattern. These two patterns are alternative solutions to the same problem. They share the same context and the same forces. Let's start by looking at the context. The context is the environment within which you develop modern applications. Today, software is eating the world. It plays a central role in almost every business in every industry. The world is crazy, or more specifically, it's volatile. It's uncertain. It's complex and ambiguous. In order to be successful, a business must be nimble.
This means that IT must deliver the software that powers the business rapidly, reliably, frequently, and sustainably. IT needs to be structured as a loosely coupled network of small teams practicing DevOps. This has big implications for architecture. In order for a network of small autonomous DevOps teams to deliver software rapidly, frequently, reliably, and sustainably, you need an architecture with several key qualities. For example, the authors of "Accelerate" describe how testability, deployability, and loose coupling are essential. In addition, if you are building a long-lived application, you need an architecture that lets you incrementally upgrade its technology stack.
How to Define an Architecture
I now want to talk about the dark energy and dark matter forces which are a refinement of these architectural quality attributes. These forces are central to the process that I like to use to design a microservice architecture. The first step distills the application's requirements into system operations. A system operation models a request that the application must handle. It acts upon one or more business entities or DDD aggregates. The second step organizes those aggregates into subdomains. A subdomain is a team-size chunk of business functionality. You might also call it a business capability. For example, in a Java application, a subdomain consists of Java classes organized into packages. Each subdomain is owned by a small team. The third step groups the subdomains to form services and designs the system operations that span multiple services. This design process groups subdomains to form services. The key decision that you must make is whether a pair of subdomains should be together or separate. I use the metaphors of dark energy and dark matter to describe the conflicting forces that you must resolve when making this decision. Dark energy is an antigravity that's accelerating the expansion of the universe. It's a metaphor for the repulsive forces that encourage you to put subdomains in different services. Dark matter is an invisible matter that has a gravitational effect on stars and galaxies. It's a metaphor for the attractive forces that encourage you to put subdomains together in the same service. These forces are actually a refinement of the architectural quality attributes I described earlier, deployability, testability, along with loose coupling, and so on.
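One way to picture the outputs of the three steps is as a small data model. The sketch below is a hypothetical illustration, not the speaker's tooling: the operation, subdomain, and service names are invented, and `services_spanned` is a helper introduced here to show why the grouping matters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemOperation:
    """Step 1: a request the application must handle."""
    name: str
    subdomains: frozenset  # step 2: subdomains whose aggregates it acts upon


def services_spanned(operation, service_of):
    """Return the set of services an operation touches under a given grouping."""
    return {service_of[subdomain] for subdomain in operation.subdomains}


# Step 3: a candidate grouping of subdomains into services (names are hypothetical).
service_of = {
    "orders": "order service",
    "customers": "customer service",
    "pricing": "order service",
}

place_order = SystemOperation("placeOrder", frozenset({"orders", "customers"}))
quote_order = SystemOperation("quoteOrder", frozenset({"orders", "pricing"}))

# An operation is local when every subdomain it references lives in one service.
place_order_services = services_spanned(place_order, service_of)
quote_order_services = services_spanned(quote_order, service_of)
```

Here `quoteOrder` is local to the order service, while `placeOrder` spans two services and would therefore need to be designed as a distributed operation.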
Repulsive Forces - Subdomains in Different Services
Let's first look at the repulsive forces that encourage decomposition. The first dark energy repulsive force is simple services. Services should be as simple as possible and have as few dependencies as possible. This ensures that they are easier to understand, develop, and test. We should therefore minimize the number of subdomains that are grouped together to form a service. The second dark energy force is team autonomy. Team autonomy is an essential aspect of high-performance software delivery. Each team should be able to develop, test, and deploy their software independently of other teams. We should therefore avoid colocating subdomains that are owned by different teams together in the same service. The third dark energy repulsive force is fast deployment pipeline. Fast feedback from local testing, from the deployment pipeline, and from production is essential. The deployment pipeline should build, test, and begin deploying a service within 15 minutes. It must be possible to test services locally. We should therefore minimize the number of subdomains that are grouped together to form a service. We should avoid mixing subdomains that can't be tested locally with those that can.
The fourth dark energy force is, support multiple technology stacks. An application often needs to use multiple technology stacks. For example, a predominantly Java application might use Python for machine learning algorithms. The need for multiple technology stacks forces some subdomains to be packaged as separate services. What's more, it's easier to upgrade an application's technology stack if it consists of multiple small services. That's because the upgrade can be done incrementally, one service at a time. Small upgrade tasks are less risky, and much easier to schedule than a big bang upgrade. In addition, it's much easier to experiment with new technology stacks if the services are smaller. The fifth dark energy repulsive force is, separate subdomains by their characteristics. It's often beneficial to package subdomains with different characteristics as separate services. These characteristics include resource requirements, security requirements, regulatory compliance, and so on. For example, it's often easier and cheaper to scale an application if subdomains with very different resource requirements are packaged as separate services. For instance, a subdomain that needs GPUs must run on an EC2 instance type that's eight times the cost of a general-purpose instance. You do not want to colocate it with subdomains that have different resource requirements. It could be extremely costly and wasteful if you did. Similarly, you can increase availability and make development easier by packaging business critical subdomains as their own services that run on dedicated infrastructure. For example, in a credit card payment application, business critical operations are those for charging credit cards. That's how money is made. Functionality such as merchant management, while it's important, is far less critical. You should therefore avoid packaging those subdomains together. Those are the dark energy forces that encourage decomposition.
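Some of these repulsive forces can be read directly off the attributes of a pair of subdomains. The following is a minimal sketch under stated assumptions: the `Subdomain` fields, the team names, and the `repulsive_forces` helper are all hypothetical, and the check is a rough checklist rather than a scoring method from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subdomain:
    name: str
    team: str                        # owning team (hypothetical names below)
    tech_stack: str
    needs_gpu: bool = False          # one example of a resource characteristic
    business_critical: bool = False


def repulsive_forces(a, b):
    """List the dark energy forces that argue for putting a and b in different services.

    The simple services and fast deployment pipeline forces are not listed
    because they apply to every pair: they always favor fewer subdomains per service.
    """
    forces = []
    if a.team != b.team:
        forces.append("team autonomy")
    if a.tech_stack != b.tech_stack:
        forces.append("multiple technology stacks")
    if a.needs_gpu != b.needs_gpu or a.business_critical != b.business_critical:
        forces.append("separate by characteristics")
    return forces


# The credit card example from the talk; team names are assumptions.
charging = Subdomain("credit card charging", team="payments team",
                     tech_stack="Java", business_critical=True)
merchants = Subdomain("merchant management", team="merchant team",
                      tech_stack="Java")

forces = repulsive_forces(charging, merchants)
```

For this pair the checklist reports team autonomy and separation by characteristics, which matches the advice to avoid packaging charging and merchant management together.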
Attractive Forces - Subdomains in Same Service
Let's now look at the dark matter forces that resist decomposition. They are generated by the system operations that span subdomains. The strength of those forces depends on the operation and the subdomains that it references. The first dark matter attractive force is simple interactions. An operation's interactions should be as simple as possible. Ideally, an operation should be local to a single service. That's because a simple local operation is easier to understand, maintain, and troubleshoot. As a result, you might want to colocate subdomains in order to simplify an operation. The second dark matter force is efficient interactions. An operation's interactions should also be as efficient as possible. You want to minimize the amount of data that's transferred over the network, as well as the number of round trips. As a result, you might want to colocate subdomains in order to implement an operation efficiently. The third force is, prefer ACID over BASE. System operations are best implemented as ACID transactions. That's because ACID transactions are a simple and familiar programming model. The challenge, however, is that ACID transactions do not work well across system boundaries. An operation that spans services cannot be ACID, and must use eventually consistent transactions, which are more complicated. As a result, you might want to colocate subdomains in order to make an operation ACID.
The fourth dark matter force is, minimize runtime coupling. An essential characteristic of the microservice architecture is that services are loosely coupled. One aspect of loose coupling is loose runtime coupling. Tight runtime coupling is when one service affects the availability of another service. For example, service A cannot respond to a system operation request until service B responds to it. As a result, you might want to colocate subdomains in order to reduce the runtime coupling of an operation. The fifth dark matter force is, minimize design time coupling. The other aspect of loose coupling is loose design time coupling. Tight design time coupling is when changes to one service regularly require another service to change in lockstep. Frequent lockstep changes are a serious architectural smell, because they impact productivity, especially when those services are owned by different teams.
Some degree of coupling is inevitable when one service is a client of another. The goal should be to minimize design time coupling by ensuring that each service has a stable API that encapsulates its implementation. However, you cannot always eliminate design time coupling. For example, concepts often evolve to reflect changing requirements, especially in a new domain. If you put two tightly coupled subdomains in different services, then you will need to make expensive lockstep changes. For example, imagine that the customer service API regularly changes in ways that break the order service. In order to support zero downtime deployments, you will need to define a new version of the customer service API and support both the old and the new versions until the order service and any other clients have been migrated to the new version.
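Supporting the old and the new API versions side by side can look roughly like the sketch below. It is a hypothetical illustration: the customer record, the `v1`/`v2` field names, and the `handle` dispatcher are invented here, and a real service would expose these through its HTTP or messaging layer.

```python
# Hypothetical stored customer record.
CUSTOMER = {"id": 42, "first_name": "Ada", "last_name": "Lovelace"}


def get_customer_v1(customer):
    """Old representation, kept until the order service and other clients migrate."""
    full_name = customer["first_name"] + " " + customer["last_name"]
    return {"id": customer["id"], "name": full_name}


def get_customer_v2(customer):
    """New representation: the single name field is split in two (a breaking change)."""
    return {
        "id": customer["id"],
        "firstName": customer["first_name"],
        "lastName": customer["last_name"],
    }


# Both versions are served at the same time; v1 is removed only after
# every client has moved to v2.
HANDLERS = {"v1": get_customer_v1, "v2": get_customer_v2}


def handle(version, customer=CUSTOMER):
    """Dispatch a request to the handler for the requested API version."""
    return HANDLERS[version](customer)


old_response = handle("v1")
new_response = handle("v2")
```

The cost the talk points at is visible even here: every breaking change adds a handler that must be maintained and later retired.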
One option for handling tightly coupled subdomains is to colocate them in the same service. This approach avoids the complexities of having to change service APIs. The other option is to colocate the two services within the same Git repository. This eliminates the complexity of needing to make changes across multiple repositories. However, you might still have issues with service API management. As you can see, the dark energy and dark matter forces are in conflict. The dark energy forces encourage you to have smaller services. The dark matter forces encourage you to have large services, or in fact a monolith. When you are designing a microservice architecture, you must carefully balance these conflicting forces. Some operations as a result will have an optimal design, whereas others might have a less optimal design. It's important to design your architecture starting with the most critical operations first, so that they can have the optimal design. The less important operations might perhaps have lower availability, or higher latency. You must make these tradeoffs.
Dark Energy and Dark Matter in Action
Now that you've had a tour of the dark energy and dark matter forces, I want to look at how well the monolithic architecture and the microservice architecture resolve these forces. The monolithic architecture is an architectural style that structures the application as a single deployment unit. A Java application would, for example, consist of a WAR file or an executable JAR. There are two main ways to structure a monolith. The first is a traditional monolith, which has a classic three-layer architecture. While the structure is simple and familiar, it has several drawbacks. For example, ownership is blurred and every team works on every layer. As a result, they need to coordinate their efforts. The second option is a modular monolith, which is shown on this slide. Each team owns a module, which is a vertical slice of business functionality: presentation logic, persistence logic, and domain logic. A major benefit of the modular monolith is that code ownership is well defined, and teams are more autonomous.
The microservice architecture is an architectural style that structures the application as a set of loosely coupled, independently deployable components or services. Each service in a Java application would be a WAR file or an executable JAR. Since a service is independently deployable, it typically resides in its own Git repository, and has its own deployment pipeline. A service is usually owned by a small team. In a microservice architecture, an operation is implemented by one or more collaborating services. Operations that are local to a single service are the easiest to implement. However, it's common for operations to span multiple services. As a result, they must be implemented by one of the microservice architecture collaboration patterns, Saga, API composition, or CQRS.
Choice of Architectural Styles
Let's look at how these architectural styles resolve the dark energy and dark matter forces, starting with the monolithic architecture. Because the monolithic architecture consists of a single component, it resolves the dark matter attractive forces. Whether it resolves the first three dark energy forces depends on the size of the application, and the number of teams that are building the application. That's because as the monolith grows, it becomes more complex, and it takes longer to build and test. Even the application startup time can impact the performance of the deployment pipeline. Also, as the number of teams increases, their autonomy declines, since they're all contributing to the same code base. For example, even something as simple as Git pushing changes to the code repository can be challenging due to contention. Some of these issues can be mitigated through design techniques such as modularization, and by using sophisticated build technologies, such as automated merge queues, and clustered builds.
However, ultimately, it's likely that the monolithic architecture will become an obstacle to rapid, frequent, and reliable delivery. Furthermore, the monolithic architecture cannot resolve the last two dark energy forces. It can only use a single technology stack. As a result, you need to upgrade the entire code base in one go, which can be a significant undertaking. Also, since there's a single component, there's no possibility of segregating subdomains by their characteristics. The monolith is a mixture of subdomains with different scalability requirements, different regulatory requirements, and so on. In contrast, with the microservice architecture pattern, the benefits and drawbacks are in some ways flipped. The pattern can resolve the dark energy repulsive forces, but potentially cannot resolve the dark matter attractive forces. You need to carefully design the microservice architecture, the grouping of subdomains to form services, and the design of operations that span services in order to maximize the benefits and minimize the drawbacks.
Let's look at an example of the architecture definition process. In the architecture definition process that I described earlier, you first rank the system operations in descending importance. You then work your way down the list, grouping the subdomains into services, and designing the operations that span services. Let's imagine that you've already designed the createOrder operation, and you end up with an architecture looking like this. The next most important operation on the list is acceptTicket, which is invoked when the restaurant accepts the ticket. It changes the state of the ticket to accepted, and schedules the delivery to pick up the order at the designated time. The first step of designing this operation is to group its subdomains to form services. However, the kitchen management subdomain is already part of an existing service, but you still need to determine where to place the delivery management subdomain. There are five options. You could assign delivery management to one of the four existing services, or you could create a new service. Each option resolves the dark energy and dark matter forces differently.
For example, let's imagine that you colocated delivery management with kitchen management. This option makes acceptTicket local to the kitchen service, which resolves the dark matter forces. However, it fails to resolve the dark energy forces. In particular, delivery management is a complex subdomain that has a dedicated team. Putting that team's subdomain in the kitchen service reduces their autonomy. Another option is to put delivery management in its own service. This option resolves the dark energy forces. However, it results in the acceptTicket operation being distributed. As a result, there is a risk that this design option might not resolve the dark matter forces. In order to determine whether this option is feasible, we need to design the acceptTicket operation.
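The two options just discussed can be compared mechanically on two of the forces. This is a hedged sketch: the `assess` helper and the team names are assumptions made for illustration (the talk only says that delivery management has a dedicated team), and the other existing services are left out because they are not named in the transcript.

```python
# Subdomains referenced by the acceptTicket operation.
OPERATION_SUBDOMAINS = {"kitchen management", "delivery management"}

# Owning teams; the names are hypothetical.
TEAM_OF = {
    "kitchen management": "kitchen team",
    "delivery management": "delivery team",
}


def assess(service_of):
    """Check one dark matter and one dark energy force for a candidate grouping."""
    # Dark matter (simple interactions): is the operation local to one service?
    services = {service_of[s] for s in OPERATION_SUBDOMAINS}
    operation_is_local = len(services) == 1

    # Dark energy (team autonomy): does any service mix subdomains of different teams?
    teams_by_service = {}
    for subdomain, service in service_of.items():
        teams_by_service.setdefault(service, set()).add(TEAM_OF[subdomain])
    teams_share_a_service = any(len(t) > 1 for t in teams_by_service.values())

    return {
        "operation_is_local": operation_is_local,
        "teams_share_a_service": teams_share_a_service,
    }


colocated = assess({
    "kitchen management": "kitchen service",
    "delivery management": "kitchen service",
})
separate = assess({
    "kitchen management": "kitchen service",
    "delivery management": "delivery service",
})
```

Colocation yields a local operation but a shared service; separation yields autonomous teams but a distributed operation, which is exactly the conflict described above.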
Classic Two-phase Commit/XA Transaction, and Saga
When implementing a distributed command, there are two patterns to choose from. The first pattern is a classic distributed transaction, also known as two-phase commit. This pattern implements the command as a single ACID transaction that spans the participating services. The second pattern is the Saga pattern. It implements the command as a sequence of local transactions in each of the participating services. It's eventually consistent. These two patterns resolve the dark energy and dark matter forces differently. 2PC has ACID semantics, which are simple and familiar, but it results in tight runtime coupling. A transaction cannot commit unless all of the participants are available. It also requires all of the participants to use a technology that supports two-phase commit. On the other hand, Sagas have loose runtime coupling. Participants can use a mixture of technology. However, Sagas are eventually consistent, which is a more complex programming model. Interactions are potentially complex and inefficient. The participants are potentially coupled from a design time perspective. Consequently, we need to design the Saga in a way that attempts to resolve the dark matter forces.
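The tight runtime coupling of 2PC can be made concrete with a little arithmetic. The sketch below assumes that participants fail independently and uses made-up availability figures; under that assumption, a transaction that needs every participant to be up can succeed only as often as all of them are up at the same time.

```python
def combined_availability(*availabilities):
    """Probability that all participants are up, assuming independent failures."""
    result = 1.0
    for availability in availabilities:
        result *= availability
    return result


# Illustrative figures only: two participants, each available 99% of the time.
two_participants = combined_availability(0.99, 0.99)

# Adding a third participant lowers the upper bound further.
three_participants = combined_availability(0.99, 0.99, 0.99)
```

With two participants the bound is about 98%, and it keeps falling as participants are added, which is why 2PC trades simplicity of semantics for availability.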
There are several decisions that you must make when designing a Saga. You need to define the steps, their order, and compensating transactions. You need to choose a coordination mechanism: choreography or orchestration. You must also select countermeasures which are design techniques that make Sagas more ACID. These decisions determine how well a given Saga resolves the dark matter forces. For example, sometimes orchestration is a better choice because the interactions are simpler and easier to understand. Similarly, one ordering of steps might be more ACID than another. Of course, for a particular grouping of subdomains, it might not be possible for a Saga to effectively resolve the dark matter forces. You must either live with the consequences or consider changing the grouping of subdomains. The acceptTicket operation can be implemented by a simple choreography-based Saga. The first step changes the state of the ticket. It then publishes the ticket accepted event. This event triggers the second step in the delivery service, which schedules the delivery. What's nice about this design is that it effectively resolves the dark matter forces. For example, it's sufficiently ACID despite consisting of multiple transactions.
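The two-step choreography described above can be sketched with an in-memory event bus. Everything here is a simplification made for illustration: the `EventBus`, the class and method names, and the `TicketAccepted` event name are assumptions, the bus delivers events synchronously, and a real system would use a message broker and would need to publish the event reliably as part of the first local transaction.

```python
from collections import defaultdict


class EventBus:
    """Toy synchronous publish/subscribe bus standing in for a message broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self._subscribers[event_type]:
            handler(payload)


class KitchenService:
    def __init__(self, bus):
        self.bus = bus
        self.tickets = {}

    def accept_ticket(self, ticket_id, ready_by):
        # Step 1 (local transaction): change the ticket state, then publish the event.
        self.tickets[ticket_id] = "ACCEPTED"
        self.bus.publish("TicketAccepted",
                         {"ticket_id": ticket_id, "ready_by": ready_by})


class DeliveryService:
    def __init__(self, bus):
        self.scheduled = {}
        # Choreography: the delivery service reacts to the kitchen service's event.
        bus.subscribe("TicketAccepted", self.on_ticket_accepted)

    def on_ticket_accepted(self, event):
        # Step 2 (local transaction): schedule the pickup for the designated time.
        self.scheduled[event["ticket_id"]] = event["ready_by"]


bus = EventBus()
kitchen = KitchenService(bus)
delivery = DeliveryService(bus)
kitchen.accept_ticket("ticket-1", "18:30")
```

Note that the kitchen service never calls the delivery service directly; the only thing the two share is the event, which keeps the runtime coupling loose.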
Summary
The answer to almost any design question is, it depends. If you're deciding between the monolithic architecture and the microservice architecture, or in the middle of designing a microservice architecture, very often, the answer depends on the dark energy and dark matter forces.
Questions and Answers
Reisz: A lot of times when I'm talking with people, customers, clients, they'll often have this desire for a microservice environment. As I start talking to them, there are smells, there are signals, there are things that are pulling them towards or away from microservices, that may not align with what their goal was. How do you talk to them when it's, for example, a very top-down directed, low autonomy environment? How do you start to talk to companies about some of the organizational changes, cultural changes that come with microservices versus a monolithic architecture?
Richardson: There are just so many different things that can go wrong. Even years ago, I recognized that one of the antipatterns of microservice adoption was trying to do microservices without changing anything else about your organization. One extreme example of that was you still could only release on a Saturday night, fourth Saturday night. I call that the Red Flag Law. That name comes from the fact that, apparently, in some jurisdictions, when automobiles started appearing in the late 19th century, someone had to walk in front of them with a red flag. It was like a car slowed down to the pace of a pedestrian. Though I'm not sure whether in reality the cars could actually go much faster anyway.
Reisz: Yes, faster horses, not cars.
Richardson: There's a lot of antipatterns. Another example of this is adopting cloud. I think this comes up too where people don't change their organization or their processes, when they adopt cloud, and then they get unhappy about cloud because it's not fulfilling the promise. I see a lot of analogies there.
Reisz: How do you, in your mind, decide when you should use choreography versus orchestration when it comes to services?
Richardson: The tradeoffs, in many ways, are specific to the situation. I even touched on this, where orchestration and choreography can resolve the dark energy, dark matter forces in different ways. One example of that is, with choreography, you can have services listening to one another's events. There's cycles in the design in terms of the dependency graph. Whereas with orchestration, it's like the orchestrator is invoking the APIs of each one of the participants. The dependencies are in a different direction. In many ways, you just have to apply the dark energy, dark matter forces to a given situation and just analyze and figure out what the consequences are.
Reisz: Randy Shoup talks about what got you here won't get you there. He's basically talking about technology and how it evolves through an organization's lifetime, like early startup versus a more enterprise that may evolve their architecture. Are there patterns that you might talk about or identify that companies see in that stage, as going from startup to scale-up, to a larger company, when it comes to microservices and monoliths?
Richardson: The key point there is that, over time, the application's non-functional requirements evolve. The job of the architect is to come up with an architecture that satisfies those non-functional requirements. I think one of the key points to make is that an organization should have architects doing actual architecture on an ongoing basis, and make sure that their architecture is evolving as necessary to meet those non-functional requirements. I think a problem most organizations do run into is they neglect their architecture.
Reisz: Keeping in mind Conway's Law, like, you ship your org chart, how does company size affect decisions towards monoliths versus microservices, or does it?
Richardson: Yes. One of the big motivations for using microservices is to enable teams to be autonomous. The general model is each team could have their own service, and then they're free to do whatever they want. If you only have one team, you're a small company with just 6, 8, 9, 10 developers, you don't have an issue with autonomy. You most likely don't need to use the microservice architecture, from a team perspective.
Reisz: What problem are you solving? If teams are stepping on each other in velocity, that's a time you may want to solve it with microservices. There's other reasons you might use it, but not necessarily the team or org structure.
Richardson: I actually tweeted about this, revisiting the Monolith First guideline, which I think is, in general, a good one. If you then look at some of the dark energy forces, there are reasons why you might want to use microservices from the beginning. One very tangible example of that is where you need to use multiple technologies. At least then you'd need a service, at least one service for each of the technology stacks that you're using. Then, if you look at some of the other dark energy forces, there are other arguments to be made possibly for using microservices from the beginning.
Reisz: Cognitive load is a topic that comes up quite a bit in Team Topologies. It comes up with Simon Wardley in Wardley maps. It comes up all over the place when we talk about microservices. When I think about it, I'm trying to phrase the question around dark energy and dark matter and cognitive load. My first inclination was to ask, is cognitive load a force for dark energy, dark matter? I didn't really have an answer. How do you think about cognitive load when it comes to microservices or to a monolith?
Richardson: There's the dark energy force, which is simple components. Then there's the dark matter force, which is simple interactions. Cognitive load, more or less, fits into those two forces. If you break your system up into services, and the world of a developer becomes just their service, then, in theory, you've reduced their cognitive load considerably.
Reisz: What are your thoughts on platform teams today and their cognitive load?
Richardson: I think having a platform team is a good idea, in the sense that, say, your stream-aligned teams, who are implementing actual business logic, can just use the platform to get certain things done. That just seems like a good idea.
Reisz: In self-service, yes. Absolutely.
What are your thoughts with the dark energy, dark matter forces when you apply it to frontends, or microfrontends? Any thoughts on that?
Richardson: It's possible that some of the same considerations apply. My focus is more on the backend, rather than the frontend.
Reisz: Similar concepts, I would think, though, as you break it down.
Do you have any thoughts on Federated GraphQL being used to automate API composition?
Richardson: GraphQL is a good API gateway technology. Because in a sense, it's just API composition on steroids: flexible, client driven. That's quite good. Federated GraphQL is maybe something a little different. I think that's implying that the services themselves actually have a GraphQL API. Maybe there are situations where there's value in a service actually having a GraphQL API. I don't think that should be the default. At least that's how I'm interpreting Federated GraphQL in this context.
Reisz: How do you balance getting quick wins versus starting with the most critical operations first? Is it always, start with the most critical, go for the value first, is that your mindset?
Richardson: This is in the context of the design process for microservices, so the on-paper activity, which actually itself should not be a massive, lengthy process.
Reisz: Any final thoughts, or how would you like to leave the audience as they're going through these exercises? What would you like to leave them with as your final thoughts?
Richardson: I think if you're an architect or even just a developer, anyone who actually has to make a decision, the first thing to know is that it depends. Usually, the answer is, it depends. Then the second thing you need to know is, what are the criteria that you should use to evaluate the differing options? Those are sort of like the first two steps towards enlightenment as a software architect. That's one, just general thought. Then in the case of monolith, microservices, I think the dark energy, dark matter forces are what it depends on. Those are the factors to consider primarily. Something that has been on my mind a lot, just based on recent conversations, is, don't automatically assume that you need a microservice to accomplish something. Sometimes, or, quite often, a JAR file is all you need, just a library.