Transcript
Skelton: My name is Matthew Skelton. I'm here with my co-author, Manuel Pais.
Pais: I'm Manuel.
Skelton: We're co-authors of this book here, "Team Topologies," published by IT Revolution Press, September 2019. We're very pleased with how the book's gone since publication. Charles Betz from Forrester Research described the book as innovative tools and concepts for structuring the next generation digital operating model.
Outline
We want to talk through, what is business agility? The point of being agile, not doing agile. The value of a product mindset. Then Manuel will share with you some examples of organizations using team topologies to help achieve business and technical agility.
How Team Topologies Help With Business and Technical Agility
First, how does team topologies help with business and technical agility? Team topologies encourages decoupling of business concepts to help make the organization more responsive. Patterns from team topologies help to turn blocking compliance checks into self-service, flow-aligned, API-driven checks. Crucially, team topologies is partly a sense making approach to help organizations gain situational awareness and therefore agility in the wider business context. Team topologies also helps the organization to focus tightly on its core mission via these streams and limiting team cognitive load.
What is Business Agility?
First off, let's just remind ourselves what we mean by business agility. Our point of view is the ability to respond rapidly to changing internal and external conditions. When we say respond rapidly, that's within hours. Something happens outside in the regulatory context or business context, we can make a change and make something happen. Or internally, likewise. Why is this important? What's driving this? The remote-first new way of working, particularly ushered in since the pandemic, but actually this was a trend that was happening over time anyway. The pandemic has brought it forward. The speed of change in technology, in climate, in geopolitical relationships, these are only increasing. We need to be able to respond to these external situations within our organization. Of course, there's global and local competition, which is increasing. We need to be able to respond to that as well.
Digital
What do we mean by this word, digital? There's actually three components to this. There's rapidly developed services that are accessed by personal compute devices: mobile phone, laptop, whatever. There's the telemetry for existing processes that are provided by software and sensors. Telemetry is a second dimension to this. Then, finally, there's highly effective ways of working that we discover from these first two things. It's a combination of these rapidly developed application services for compute devices. It's the telemetry that goes along with them, and these highly effective ways of working. That gives us organizational agility, this business agility that we're looking for. The kind of questions that we should be asking ourselves, if we're heading towards true business agility, are, how would we optimize for a fast flow of change through the organization? How would we make sure we focus on user needs? How would we produce the right thing in the right way at the right time? How would we easily course correct when we need to adjust? How would we maximize our chances of finding new opportunities for innovation? These are all the questions that we should be asking if we're looking to achieve real business agility.
Being Agile, not Doing Agile
Part of that is about being agile, not doing agile. What do we mean by that? It sounds easy. The great thing is, particularly since the State of DevOps reports have been published over the last few years, that we now have a strong body of evidence that points us in the right direction. Going back to 2013, at least, and more recently, the surveys of over 31,000 IT professionals worldwide over many years. This is an independent view into the practices and capabilities that drive high performance. Crucially, the four key metrics that we'll look at, that were identified by the State of DevOps report and the "Accelerate" book, as being strong predictors of high organizational performance.
A whole set of practices were also identified as essential to make this stuff work, in particular these four key metrics from the "Accelerate" book, whose authors are, of course, also authors of the State of DevOps reports. Lead time, the time it takes for a change to go into production. Deployment frequency, how often we deploy. Mean time to restore, the average time it takes to restore a live production service after an outage. Change fail percentage, the percentage of changes to live services that fail. By optimizing for these four, and a couple of other things, we start to head towards business agility, by improving our technical practices. Business agility is underpinned and supported by technical agility, and good technical practices.
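As an illustration of the four key metrics just described, here is a minimal Python sketch of how they might be computed from deployment records. The `Deployment` record, its field names, and the `four_key_metrics` function are assumptions made for this example and do not come from the talk or from the "Accelerate" book. The sketch uses simple means, and it assumes an outage starts at the moment of the failed deployment.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class Deployment:
    """Hypothetical record of one change reaching production."""
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None  # set once a failed change is fixed


def _hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def four_key_metrics(deployments: List[Deployment], period_days: int) -> Dict[str, float]:
    """Compute the four key metrics over a reporting period."""
    if not deployments:
        raise ValueError("need at least one deployment")
    failures = [d for d in deployments if d.failed]
    # Assumption: time to restore is measured from the failed deployment.
    restore_times = [
        _hours(d.restored_at - d.deployed_at)
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "lead_time_hours": mean(_hours(d.deployed_at - d.committed_at) for d in deployments),
        "deploys_per_day": len(deployments) / period_days,
        "mean_time_to_restore_hours": mean(restore_times) if restore_times else 0.0,
        "change_fail_percentage": 100 * len(failures) / len(deployments),
    }
```

A team could feed this from its deployment pipeline history and watch the trend over time, rather than treating any single value as a target.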
Some of these include making sure we're getting fast feedback from deployment pipelines. Things like test driven development. Making sure we've got team ownership of software and services. Crucially, making sure we've got configuration in version control. Expecting our services and software to be very transparent in operation, so observability. The ability to see what's going on inside. We're designing these systems for automation, not for humans to click around. We're making sure that we're re-aligning our architecture for a fast flow of change. Some people call this flow architecture. There's a book that's recently come out. The principle there is we cannot just expect our existing software architectures to remain the same and have a fast flow of change. We need to realign this architecture.
Some practices that seem to work well in this space to help us get this business agility, are domain driven design, and the ideas that come out of DDD. This is untangling business concepts to enable a faster flow of change. There's a lot more to it than that. That's one of the main benefits that we see from organizations adopting DDD. We can combine DDD with Wardley mapping, which is there to increase situational awareness and to help us apply the right techniques in different situations. Should we build it? Should we rent it? Should we buy it? What's the right place to invest our effort and energy based on what's happening in the technological and business landscape?
Other organizations are also combining these two practices, so DDD with Wardley mapping, and they're adding on team topologies. Team topologies with its focus on fast flow of change, rapid feedback from live systems. The way in which teams interact, can be used as a signal. The evolution of the organization and the evolution of the team structures and interrelationships is a key theme in team topologies. Of course, things like limiting team cognitive load, to enable the rapid flow of change. The combination of these three things seems to be very effective for organizations really helping them to have greater business agility.
Rapid Flow of Change
Diving in a little bit more then into some of the key themes around team topologies. We start with the principle, we need a rapid flow of change in the organization. This is a rapid flow of changes to our software systems. These days, of course, the vast majority of tech in the organization that's building modern systems is going to be software driven. We've got infrastructure as code, and so on. We're driving everything through software principles and software practices. It's not good enough, though, just to have a rapid flow of change towards production. We need rapid feedback from these running systems to enable us to course correct. A key principle in team topologies is that handovers from one group to another can kill that flow of change. If we've got multiple groups, all in the flow of change, that really slows down how quickly you can make changes through to the live systems. We need to remove these handovers. That's why our starting point is a stream-aligned team with end-to-end responsibility for a particular part of the business domain, usually, and they get rapid feedback from live systems. The stream-aligned team then is supported by three other types of teams, the enabling team, the complicated subsystem team, and the platform team. These three supporting teams are there primarily to limit or reduce the cognitive load on the stream-aligned teams to enable a rapid flow of change.
It's not just about types of team, though. We need to think about the ways in which they interact, just as with any system, or even any complex adaptive system. We need to think about the ways in which different bits of the system interact. That's why we've got interaction modes in team topologies. Collaboration, two teams working together for a defined period of time to achieve a specific outcome. X-as-a-service, one team providing, one team consuming something. Low friction, scales well. Facilitating, one team helping another team to do something or bridge a capability gap. It's a combination of team types and team interaction modes that enables us to listen to the signals inside the organization and evolve how we work to increase our business agility.
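The four team types and three interaction modes named above can be written down as a small data model. The Python sketch below is only an illustration: the class names, fields, and the `overdue_collaborations` helper are assumptions for this example, not something defined in the book. It treats a collaboration with no planned end date, or one past its planned end, as a signal worth a conversation.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import List, Optional


class TeamType(Enum):
    STREAM_ALIGNED = "stream-aligned"
    ENABLING = "enabling"
    COMPLICATED_SUBSYSTEM = "complicated subsystem"
    PLATFORM = "platform"


class InteractionMode(Enum):
    COLLABORATION = "collaboration"
    X_AS_A_SERVICE = "x-as-a-service"
    FACILITATING = "facilitating"


@dataclass(frozen=True)
class Team:
    name: str
    type: TeamType


@dataclass(frozen=True)
class Interaction:
    first: Team
    second: Team
    mode: InteractionMode
    planned_end: Optional[date] = None  # collaboration should be time-boxed


def overdue_collaborations(interactions: List[Interaction], today: date) -> List[Interaction]:
    """Return collaborations that are open-ended or past their planned end."""
    return [
        i for i in interactions
        if i.mode is InteractionMode.COLLABORATION
        and (i.planned_end is None or i.planned_end < today)
    ]
```

The point of such a model is not enforcement. It simply makes the expected interaction explicit, so a mismatch between expectation and reality is easier to notice.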
This is a diagram from the book. It shows a snapshot of a small part of the organization at a particular point in time, representing different kinds of activities happening in order to achieve certain ends. It's an example diagram. A key idea in team topologies is to track dependencies and separate them in terms of whether they're blocking dependencies, or non-blocking. Software has all kinds of dependencies at multiple different levels. That's how we're able to build these kinds of systems. The crucial thing is, does your dependency block you from making a change? Does it block the flow of change, or can you simply self-serve and consume that capability? We need to be conscious of these different kinds of dependencies, blocking versus non-blocking, and obviously aim towards making more of our dependencies non-blocking, so that teams are able to self-serve and have a rapid flow of change. If we have these barriers to flow, hand-offs, approval gates, manual inspections, and so on, we need to be aiming to replace them with self-service APIs for the teams that use them. This of course presents a bit of a challenge, particularly in the compliance space, in moving from a mode of permitting change to flow through the organization, towards enabling change to flow quickly. That's a huge mindset shift that we've seen with lots of organizations that we've worked with. This is a question in this context, what would be needed for us to be compliant with security or finance or personally identifiable information rules, with multiple, decoupled rapid flows of change? It starts with self-service APIs really.
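To make the blocking versus non-blocking distinction concrete, here is a small Python sketch of tracking dependencies between teams. Everything in it is hypothetical: the `Dependency` record, the `needs_handoff` flag, and the team names are assumptions for illustration. A dependency is treated as blocking when the consuming team has to wait on a hand-off, approval gate, or manual inspection, and as non-blocking when it can self-serve.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class DependencyKind(Enum):
    BLOCKING = "blocking"
    NON_BLOCKING = "non-blocking"


@dataclass(frozen=True)
class Dependency:
    consumer: str        # team that needs the capability
    provider: str        # team or service that provides it
    needs_handoff: bool  # True if the consumer must wait on a ticket or approval


def classify(dep: Dependency) -> DependencyKind:
    """A dependency blocks flow when it requires waiting on another group."""
    return DependencyKind.BLOCKING if dep.needs_handoff else DependencyKind.NON_BLOCKING


def blocking_by_consumer(deps: List[Dependency]) -> Dict[str, List[str]]:
    """List, per consuming team, the providers it is blocked on."""
    blocked: Dict[str, List[str]] = {}
    for dep in deps:
        if classify(dep) is DependencyKind.BLOCKING:
            blocked.setdefault(dep.consumer, []).append(dep.provider)
    return blocked
```

A list like the one returned by `blocking_by_consumer` is one way to see which hand-offs are candidates for replacement with self-service APIs.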
The Value of a Product Mindset
Let's have a brief look at the value of a product mindset and how that can help with business agility. Marty Cagan, who is part of the Silicon Valley Product Group, identified a product as a holistic user experience: functionality, design, monetization, and content. Let's think about product. If I walk into a city center, or a shopping center, and there are some running shoes in a shop, am I forced to buy those running shoes? No, I can choose to buy them. That's a product. A product is optional to use. No one's forced to use the product. Product is also carefully designed and curated. The experience of that product is curated. Another key aspect of what makes something a product, the thinking and care that's gone into it, to make it work really well. A product also simplifies something for users. This is a photograph of a machine that I took, an early jukebox machine. It's actually sitting at a pier in San Francisco. You put a quarter, or 25 cents in there, and it plays a tune. At the time when this came out, this was like an early version of Spotify. It simplified my life. It means I don't have to carry around a whole stack of 7-inch singles or whatever it was at the time. Products simplify something for its users.
A product is also expected to evolve to take advantage of changing technology, and the changing ecosystem. These are all aspects that make a product good. We've got a strong focus on user needs driving good software. We're trying to take these principles of good products and apply them to software inside our organization, particularly for situations where the users of our products are internal. For situations where we've got internal software that is being consumed by internal users, we need to think about the experience of those people. We can say the software should get out of the way, we need to design for usability. This is where it's really important for us to apply product management techniques for internal platforms.
Let's return to these product principles and see how this applies to an internal platform. Here's my running shoe example again. It should be optional for me to use this product. That means the platform is optional to use. No team is forced to use it. Internal platform team should effectively use internal marketing techniques, advocate for their platform product and market it to internal teams who are going to use it. Using things like user personas, user experience, and actually going to talk to these teams and see what they need. The purpose of this internal platform should be to reduce the cognitive load on teams that are using it, to enable them to have a fast flow of change. That's one important route to achieving real true business agility. In this context, a platform is a curated experience for engineers, the customers of the platform.
Team Topologies Examples
Pais: I'm going to talk through two examples, in particular, that highlight the ideas of business and technical agility that Matthew has been talking about. We have a number of case studies, some of them were already included in our book. Companies like Adidas, IBM, TransUnion, applying some of the patterns that we talk about in the book, which now provides a more holistic and cohesive view on the practices and patterns and approaches to organizational evolution that works well for many organizations that are looking for more agility. It's been now about 20 months since the book was published. We have some more examples that we've been publishing on the website. There are many companies adopting the ideas in "Team Topologies," in order to achieve higher agility. These are some of them. There are many others that we can't disclose at the moment but we've worked with, or we've talked with, including across many industries, banking, government, telecom, healthcare, and so on. It's really quite agnostic to industry, because all of these organizations these days are mostly driven by software products and services.
1. Footasylum
One of those organizations is Footasylum. They've been around for about 15 years. They grew very quickly. They had physical stores for footwear. Now they obviously have a strong online presence. They have 2500 employees, more or less. What's interesting here is that in 2019, they really came up against some major challenges to their agility. They started small. Obviously, at that moment it was quite easy to make changes. As they grew, they expanded into different regions of the world. They had more teams. They started to see a lot of fragmentation. They had teams essentially aligned to their geography, development teams in the UK, product management team in the UK, as well. Then other teams in Romania, India, but mostly aligned in a functional way. This was causing a lot of difficulties in actually having a product mindset and being able to make changes quickly.
Even before "Team Topologies" came out, they were looking at, what are the actual streams of work that we have inside the organization? What are the business segments that we actually need to align our teams to, so that we can have that higher business agility? They combined this with Wardley mapping. They found the book "Team Topologies." They were also combining that with Wardley mapping, so that you could have not just a view of what are their streams of work, their products, what is the platform that they need internally, but also, how they expect this to evolve over time? What is the strategy in terms of what should we build? What should we buy or rent? How do we expect these things to evolve over time?
Another quite interesting approach they had, especially in terms of this technical agility, is that they've been thinking about the platform as a minimum viable product. We call this the thinnest viable platform in the book. Essentially, trying to keep their internal platform as small and simple as possible. They have one example which is quite interesting where they needed a service with the locations of physical stores. They first thought, we're going to need an API with a database behind it, and a standard service. Then they realized, actually, this is data that's almost static. It changes very infrequently. Maybe all that we need is a version-controlled data file. That's what they did. They haven't changed that approach. They really took this idea of let's start small, see what is the minimum we need, and then evolve over time. Rather than start big, do a lot of complicated design up front, and only then find out whether that's needed or not. They kept this idea of agility also within their internal platform.
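The store locations example can be sketched in a few lines of Python. This is not Footasylum's implementation, which the talk does not show. The JSON layout, the `Store` fields, the file contents, and the function names are all assumptions made for illustration, and the store entries are placeholder data. The idea is only that a data file kept in version control, read directly by consumers, can stand in for an API with a database behind it when the data is almost static.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass(frozen=True)
class Store:
    """Hypothetical shape of one physical store entry."""
    name: str
    city: str
    latitude: float
    longitude: float


def parse_stores(text: str) -> List[Store]:
    """Parse the contents of the version-controlled data file."""
    return [Store(**entry) for entry in json.loads(text)]


def load_stores(path: Path) -> List[Store]:
    """Read the data file from a checkout of the repository."""
    return parse_stores(path.read_text(encoding="utf-8"))


# Placeholder content standing in for a file such as stores.json in the repo.
EXAMPLE_FILE = """
[
  {"name": "Example Store A", "city": "City A", "latitude": 1.5, "longitude": -2.5},
  {"name": "Example Store B", "city": "City B", "latitude": 3.0, "longitude": 4.0}
]
"""
```

Changing a store location then becomes an ordinary reviewed commit, and the history of the file doubles as an audit trail.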
The clarity in terms of the types of teams and the interactions between teams also helped them, especially because they started from the idea that if we want fast flow, if we want this agility, being able to respond quickly, we shouldn't expect the teams to be static and never change. We should expect to identify when different teams will need to interact, when maybe we need an enabling team to help address some capability gaps. Maybe we need some new services in the platform to address some aspects of the lifecycle that teams need help with. Essentially, they're using the interactions between teams to actually evolve and think about, what do we need to do next to keep being agile, both internally in the platform and in their products to the external customers?
The ideas they took from Team Topologies that worked well include this focus on the streams of work, on what are the actual business domains that should be more or less decoupled, or actually highly decoupled so that we can evolve them independently. Keeping the platform as thin and simple as is viable. Evolving the teams' interactions and combining with Wardley maps, so that we have a strategic view of what we expect to become more of a commodity over time, or what we expect to become some new products possibly that we will be developing.
2. Uswitch
The second case study is from Uswitch. They're a comparison website, mostly for comparing home utilities, like internet providers and energy providers, making it easy to change. There are two aspects that are quite interesting here. Around 2010, they started moving towards autonomous teams. We're talking here about truly autonomous teams. Each of these teams that you can see here was aligned to one of these utilities that I mentioned. There was one team, for example, focused on energy providers' comparison, and making that available to customers. This idea of autonomy was quite extended. These teams would make all sorts of decisions around infrastructure, around which tools they wanted to use, as well as the experiments around the product and how to evolve it. This worked quite well for a period of time for them.
As they grew, and as their services became more complicated, they realized, and actually, Paul Ingles, the CTO wrote about this, how they were having more cognitive load. More aspects of, in this case, using AWS services that they had to keep in mind in order to make any change. Because of the high cognitive load, they were limited in the agility and the speed that they were able to change things. Paul Ingles said they were tracking actually the direct AWS API calls at that time. This is circa 2015. They were seeing a steady increase as the services got more complicated. This was actually a proxy for how much cognitive load each of these teams had because they had to deal with all this complexity, all these services that they were using directly from AWS. What Paul Ingles realized, is that people are spending too much time interacting with low-level services and spending their time on low-value decisions, compared to high-level user focused decisions.
They introduced the platform for making it easier to have cloud infrastructure services, which were aligned with the needs of the internal teams. Over time, they saw as the adoption of this platform grew, it wasn't mandated, because it was seen as an internal product, not something that everyone had to use. As platform adoption grew, they actually saw these direct AWS calls, the number going down significantly. Again, this was a proxy they were using to understand the load on teams in terms of low-level infrastructure. Then this evolved to other platforms that they have today. This was very much based on internal needs of the teams, not an upfront design of this is our big platform that we're going to strive for. It was actually driven by internal needs, and basically what the internal market was asking from the platform teams.
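The proxy described above, the number of direct low-level AWS API calls over time, could be tallied along these lines. This Python sketch is an assumption-laden illustration and not how Uswitch measured it: the `(month, team, via_platform)` record format and the function name are invented for the example, and where such records would come from is left open.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical record: (month as "YYYY-MM", team name, call made via the platform?)
CallRecord = Tuple[str, str, bool]


def direct_calls_per_month(records: Iterable[CallRecord]) -> Dict[str, int]:
    """Count calls made directly to low-level services, per month.

    Calls routed through the internal platform are excluded, so a falling
    count suggests teams are carrying less low-level infrastructure load.
    """
    counts: Counter = Counter()
    for month, _team, via_platform in records:
        if not via_platform:
            counts[month] += 1
    return dict(counts)
```

As in the talk, the number is only a proxy. It says something about load on teams when read as a trend alongside platform adoption, not as an absolute measure.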
I actually spoke last year at QCon on Kubernetes, and the fact that it's a foundation to such an internal platform, like they have at Uswitch. There's now an article based on that talk. If you're interested, I go into a lot more detail around what makes a good platform, platform adoption metrics, and so on.
To sum up this case study, Paul basically wanted the teams to organize around these principles, which we are very familiar with in engineering, around loosely coupled, highly cohesive systems, but now applied to teams as well. Team topologies helped them and provided the language to talk about that. They look at the platform as a curated experience. An internal product that helps reduce complexity for teams. They actually moved from this idea of fully autonomous teams to the idea that there are some dependencies on the platform team, but because it's based on self-service, those dependencies are not blocking. They're actually helping those teams accelerate and reducing cognitive load.
Summary
Business and technical agility is really about responding rapidly to changing external and internal conditions. We need to have situational awareness, clarity of business purpose, good technical practices, and local decisions to actually be agile. Having a product mindset means we're focusing strongly on user needs, user experience, viability of the product and services. We looked at a couple of real-world examples of applying these ideas. Team topologies helps with this approach and this intent of increased business and technical agility by encouraging the decoupling of business concepts to start with, to make the organization more responsive. It helps move from blocking compliance checks, and all kinds of other checks, to self-service that promotes fast flow and decoupled streams through APIs. Team topologies is also about making sense of where we are today and how the organization can gain more awareness and therefore agility, and move faster organizationally as well. It helps focus on the mission of the organization with streams and limiting cognitive load.
Resources
We have a number of free resources on our website, as well as the industry examples that we talked about and more. We have a couple of infographics, if you want to share the key principles or ideas we talked about, and are in the book. There's an upcoming Team Topologies Academy, where we'll have on-demand training for people who want to go more in-depth. There's a partner program we're starting. If you're interested, you can get in touch through, partners@teamtopologies.com. You can sign up for news and tips on our website as well.
Questions and Answers
Shoup: Fabio asks about customer centricity. Being focused on the customer, and when there are business units that are doing different things in parallel to the customer, what's an opportunity for IT, maybe, or team topologies to influence that?
Skelton: I think this relates to, what are the most important domains in the business? Sometimes it's useful to have a fast flow of change aligned to different products. Sometimes it actually goes against what's good for the customer, and what's good for the business. That speaks to maybe rethinking some of the ways in which the organization has got things aligned, and maybe rethinking what a product means. Sometimes, we actually want to give the impression of a product that's somewhat larger maybe. You might adjust the internal boundaries and how the work flows through the organization, so that you're meeting those customers in a different way. Alternatively, you might keep the separate products separate, as they currently are, or something similar, but actually have what we would call a platform. It will be a design platform. It will be like a user experience platform, which would help multiple teams to meet the needs of customers in terms of the user experience, without having to think so much about it, without having to be so aware of all the UX details. Instead of like a data or infrastructure platform, we've got a platform that's focused on design or user experience, which then offloads a lot of that cognitive load away from teams. Maybe that platform has a lot of templates or examples of ways of doing things that help to get a coherent customer experience without coupling the architecture and the flows of change.
Pais: I mentioned the book, "Designing Delivery" by Jeff Sussna, where I think there are different levels that you can look at this. Obviously, there's design of your application services, there's the user experience, and there's also what Jeff calls the promises that you're making to your customers. That can be even beyond just a common recognizable user experience. It's actually, what are we promising in terms of how we behave with the customers, and things that should be consistent across multiple products, if at least you're marketing them as coming from the same brand? Inside a large organization you might have different brands. Then they're separate in terms of what they promise the customer. It's a really interesting book. I recommend reading that. Specifically, in "Team Topologies," we talk about service experience teams as having that sense of how different products or services that we offer, how they interact, and how do we make that experience more consistent?
Shoup: The next question was around the difference between platform teams and enabling teams? I'm sure you get this question a lot, and you've made clear the distinction. I think maybe a set of examples are like platform teams are about providing this type of service and enabling are about these others, just so people have that in their heads?
Skelton: We were really keen to separate platform from an enabling team, because they have two very different sets of behavior. The typical behavior, or way of interacting within the organization, for a platform is that the platform is providing a service. That is its principal way of interacting with the teams. It's there to offload some of the cognitive load from other teams that are using the platform. It's there to help encourage a fast flow of change through the teams that are using the facilities in the platform. The primary thing we're doing when we work in the platform is providing services that just work really well. They've got a good user experience or developer experience. They're friction free, and so on. That's our focus.
However, the focus of an enabling team is very different. The focus of an enabling team is to bridge capability gaps inside the teams that we're working with, to help those teams understand new technologies, or to migrate from one technology to another, or to adopt a new way of working, whatever. These are two very different ways of working. Now, clearly, if you've got some skilled people in the organization, this week, they can be building something in the platform, maybe next week, they can be acting as part of an enabling team and helping someone understand new technologies around machine learning or something. We wanted to draw a distinction because these are two very different ways of working. We wanted to make it super clear that there's value in characterizing the different ways of working so that we can set expectations. If we've got expectations about how that interaction will feel when we're working with a different team, then if something isn't working about that interaction, we can actually detect it much more quickly, and say, "There's something not quite right here. This doesn't feel right." That was part of the reason, to gain some clarity about the purpose of different kinds of activity and interaction inside the organization.
Pais: It's also a little bit about maintaining expectations between teams. If you have someone work in a platform team, and let's say, you are doing some enabling work, helping another team learn about monitoring or what have you. You get a call that you need to do support, because there's a problem with the platform service. That's not going to be a good experience. The other team was expecting you to be there to enable them, to help them, and now you have to go and answer to some incident, or something else. Or, a person in the platform team is developing, working on coding and creating some new functionality, but now you're being asked to do some support work. The more you put on the shoulders of the platform teams, also, the more cognitive load they have, the harder it's going to be to respond effectively.
Also, enabling teams can have much broader domains than a platform service. I mentioned regulations, GDPR for people familiar with data privacy regulations in the EU, and user experience, where often organizations don't have enough experts in user experience to meet the needs of all the teams. These are areas in which you don't necessarily have platform services, to start with at least, and where enabling actually makes a lot of sense. Sometimes, you might need to do some enabling, be more of an enabling team for a period of time, just to help teams improve their skills around some domain, and then start focusing more on platform.
Pragmatically, we've seen examples where clients say, we don't have effort available to create an enabling team. You might have to start with, let's do a bit of enabling work. Maybe dedicate one person in the platform team for a period of time, but it's not very sustainable in the long run. It might be a starting point.
Shoup: The platform is about an API. Then, how do you have an API around a user experience thing? Then I really love, Manuel, that you're like, you can iterate your way there. I start by enabling, and maybe that's even the way I develop a platform. Working with the team, and starting with a pilot and an embedded model.