Uber released version 1.0 of Cadence, its workflow orchestration platform, after six years in development. Uber and other companies use Cadence to build stateful services at scale using native programming languages. For subsequent releases, the team is targeting usability, observability, and efficiency improvements.
Cadence is an open-source platform for workflow orchestration. Like other systems of that type, it helps handle complex stateful workflows at scale with efficiency and reliability in mind. Unlike similar platforms, Cadence lets developers define workflows directly in programming languages such as Java and Go (officially supported) or Python and Ruby (supported by the community).
Ender Demirkaya, a senior manager at Uber, explains the reasons for Cadence’s approach toward building orchestration workflows:
Traditionally, workflows have been written with DSLs or configs defining the order and the dependencies between the tasks. While this approach made workflow orchestration simpler, it limited how much a user could do with a workflow or made DSLs and configs overly complicated over time to a level that they were no longer practical. [...] Growing use cases and complexity proved it necessary to write workflows as freely as writing programs in native programming languages. [...] Instead of configs and DSL, programming is also the natural way of thinking for software engineers.
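To make the contrast with DSLs and configs concrete, here is a minimal sketch of a workflow written as ordinary code. It is plain Python and does not use the Cadence client API; every name in it (Order, reserve_item, make_flaky_charge, with_retries, order_workflow) is hypothetical and exists only for illustration. It also leaves out the durable state that an orchestration service would add, so it shows the programming model only.

```python
# Illustrative only: plain Python, not the Cadence client API.
# All names below are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Order:
    order_id: str
    items: list
    log: list = field(default_factory=list)


class TransientError(Exception):
    """Stands in for a recoverable failure such as a timeout."""


def reserve_item(order, item):
    # Hypothetical activity: pretend to reserve stock for one item.
    order.log.append(f"reserved:{item}")


def make_flaky_charge(failures):
    # Builds a hypothetical activity that fails `failures` times, then succeeds.
    state = {"remaining": failures}

    def charge(order):
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise TransientError("payment gateway timeout")
        order.log.append("charged")

    return charge


def with_retries(activity, order, max_attempts):
    # The retry policy is an ordinary loop rather than a config entry.
    for attempt in range(1, max_attempts + 1):
        try:
            activity(order)
            return attempt
        except TransientError:
            if attempt == max_attempts:
                raise


def order_workflow(order, charge, max_attempts=3):
    # Loops, branches, and error handling are plain language constructs.
    for item in order.items:
        reserve_item(order, item)
    try:
        attempts = with_retries(charge, order, max_attempts)
    except TransientError:
        order.log.append("cancelled")
        return "cancelled"
    order.log.append(f"attempts:{attempts}")
    return "completed"
```

Calling order_workflow(Order("o-1", ["a", "b"]), make_flaky_charge(2)) reserves both items, retries the charge until it succeeds, and returns "completed". The same logic in a declarative format would need dedicated syntax for the loop, the retry policy, and the failure branch.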
A typical Cadence application comprises the Cadence service, workflow and activity workers, and external clients. Components other than the Cadence service are application-specific and responsible for defining and configuring the workflow and executing workflow steps.
The Cadence service is a highly scalable multi-tenant system with a gRPC API. It provides the core functionality for workflow orchestration, including storing versioned workflow definitions and dispatching workflow task execution to application-owned workers. It also provides a stateless API frontend and internal workers for executing Cadence-specific workflows, such as archiving. At Uber, a single Cadence service can support over a hundred applications in a typical deployment. For local development, developers can run a local Cadence instance using Docker.
Cadence deployment topology (Source: Cadence Documentation)
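The division of labor between the service and application-owned workers can be sketched with an in-memory toy model. This is not the Cadence service or its client library: ToyService, ToyWorker, the queue names, and the single flat task type are all assumptions made for illustration, and the real system's persistence, gRPC transport, and task scheduling are omitted. The point it illustrates is that the service only stores state and hands out tasks, while application code runs in the worker.

```python
# Illustrative only: an in-memory stand-in, not the Cadence service or client.
# ToyService and ToyWorker are hypothetical names.
from collections import deque


class ToyService:
    """Holds workflow state and hands out tasks; never runs application code."""

    def __init__(self):
        self.queues = {}   # queue name -> pending tasks
        self.history = {}  # workflow id -> recorded (step, result) pairs

    def start_workflow(self, workflow_id, queue, steps):
        # Called by an external client to enqueue the workflow's steps.
        self.history[workflow_id] = []
        pending = self.queues.setdefault(queue, deque())
        for step, payload in steps:
            pending.append((workflow_id, step, payload))

    def poll(self, queue):
        # Called by a worker; returns None when nothing is pending.
        pending = self.queues.get(queue)
        return pending.popleft() if pending else None

    def complete(self, workflow_id, step, result):
        # Called by a worker to record the outcome of a step.
        self.history[workflow_id].append((step, result))


class ToyWorker:
    """Application-owned process that registers handlers and polls for tasks."""

    def __init__(self, service, queue):
        self.service = service
        self.queue = queue
        self.handlers = {}

    def register(self, step, handler):
        self.handlers[step] = handler

    def run_until_idle(self):
        # Poll the service until the queue is drained; return the task count.
        handled = 0
        while True:
            task = self.service.poll(self.queue)
            if task is None:
                return handled
            workflow_id, step, payload = task
            result = self.handlers[step](payload)
            self.service.complete(workflow_id, step, result)
            handled += 1
```

In this model, a client calls start_workflow, a worker registered for the same queue calls run_until_idle, and the service ends up holding the recorded results without ever having executed application logic itself.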
The Cadence service stores its data in a persistent data store and supports Apache Cassandra, MySQL, PostgreSQL, CockroachDB, and TiDB. Additionally, Elasticsearch or OpenSearch clusters can be used for advanced search functionality.
At Uber, Cadence powers over one thousand services and manages over 12 billion workflow executions a month. The platform is used for various use cases, including microservice orchestration, batch processing, distributed cron, distributed singleton, data pipelines, and model training.
Following the 1.0 release, the team wants to improve usability in both operations and development, with better documentation, code samples, and application code quality checks. Furthermore, they plan observability improvements: expanding the reported metrics and available alerts, and integrating them better with the web interface. Lastly, the team will work on lowering the database load and increasing storage capacity.
Cadence is the original open-source project developed at Uber, but over 700 forks exist, including Temporal.