InfoQ Homepage Performance & Scalability Content on InfoQ
-
High HTTP Scaling with Azure Functions Flex Consumption: Q&A with Thiago Almeida and Paul Batum
Microsoft has introduced a significant enhancement to its Azure Functions platform with the Flex Consumption plan, designed to handle high HTTP scale efficiently. This new plan supports customizable per-instance concurrency, allowing users to achieve high throughput while managing costs effectively. In practical tests, Azure Functions Flex demonstrated the ability to scale from zero to 32,000 RPS.
-
Microsoft Introduces the Public Preview of Flex Consumption Plan for Azure Functions at Build
At the annual Build conference, Microsoft announced the flex consumption plan for Azure Functions, which brings users fast and large elastic scale, instance size selection, private networking, availability zones, and higher concurrency control.
-
Allegro Reduces Kafka Producer Latency Outliers by 82% after Switching to XFS
Allegro experimented with different performance optimization options to improve Apache Kafka producer tail latency and eventually switched all its clusters to the XFS filesystem. The company used Kafka protocol sniffing, JVM profiling, and eBPF, which proved instrumental in identifying and eliminating performance bottlenecks.
-
QCon London: Scaling Microservices Architecture and Technology Organization at Trainline
During the recent QCon London conference, Trainline’s CTO spoke about the evolution of the company’s system architecture and organizational structure over the last five years. The company had to adapt to market changes and growing customer expectations by improving the performance and reliability of its technology platform.
-
QCon London: Lessons Learned from Building LinkedIn’s AI/ML Data Platform
At the QCon London 2024 conference, Félix GV from LinkedIn discussed the AI/ML platform powering the company’s products. He specifically delved into Venice DB, the NoSQL data store used for feature persistence. The presenter shared the lessons learned from evolving and operating the platform, including cluster management and library versioning.
-
QCon London: How Duolingo Sent 4 Million Push Notifications in 6 Seconds During the Super Bowl Break
As part of the Super Bowl marketing campaign, Duolingo sent out 4 million mobile push notifications when the company’s five-second ad aired during the commercial break. At QCon London, Doulingo’s engineers presented the asynchronous AWS architecture responsible for broadcasting messages to millions of users across seven US cities.
-
QCon London: Meta Used Monolithic Architecture to Ship Threads in Only Five Months
Zahan Malkani talked during QCon London 2024 about Meta’s journey from identifying the opportunity in the market to shipping the Threads application only five months later. The company leveraged Instagram's existing monolithic architecture and quickly iterated to create a new text-first microblogging service in record time.
-
Expedia Speeds up Flights Search with Micro Frontends and GraphQL Optimizations
Expedia made flight search faster by up to 52% (page usable time) by applying a range of optimizations to web and mobile applications. To support these improvements, the company improved the observability of its applications. Expedia Flights web application has been migrated to Micro Frontend Architecture (MFA) to allow flexibility, reusability, and better optimization.
-
Hashnode Creates Scalable Feed Architecture on AWS with Step Functions, EventBridge and Redis
Hashnode created a scalable event-driven architecture (EDA) for composing feed data for thousands of users. The company used serverless services on AWS, including Lambda, Step Functions, EventBridge, and Redis Cache. The solution leverages Step Functions' distributed maps feature that enables high-concurrency processing.
-
Uber Builds Scalable Chat Using Microservices with GraphQL Subscriptions and Kafka
Uber replaced a legacy architecture built using the WAMP protocol with a new solution that takes advantage of GraphQL subscriptions. The main drivers for creating a new architecture were challenges around reliability, scalability, observability/debugibility, as well as technical debt impeding the team’s ability to maintain the existing solution.
-
Pinterest Open-Sources a Production-Ready PubSub Java Client for Kafka, Flink, and MemQ
Pinterest open-sourced its generic PubSub client library, PSC, which has been heavily used in production for a year and a half. The library helped the engineering teams by increasing developer velocity, and the scalability and stability of services using it. Over 90% of Java applications have migrated to PSC with minimal changes.
-
Uber Improves Resiliency of Microservices with Adaptive Load Shedding
Uber created a new load-shedding library for its microservice platform, serving over 130 million customers and handling aggregated peaks of millions of requests per second (RPSs). The company replaced the solution based on QALM with Cinnamon library, which, in addition to graceful degradation, can dynamically and continuously adjust the capacity of the service and the amount of load shedding.
-
How RevenueCat Manages Caching for Handling over 1.2 Billion Daily API Requests
RevenueCat extensively uses caching to improve the availability and performance of its product API while ensuring consistency. The company shared its techniques to deliver the platform, which can handle over 1.2 billion daily API requests. The team at RevenueCat created an open-source memcache client that provides several advanced features.
-
Discord Scales to 1 Million+ Online MidJourney Users in a Single Server
Discord optimized its platform to serve over one million online users in a single server while maintaining a responsive user experience. The company evolved the guild component, which is responsible for fanning out billions of message notifications, in a series of performance and scalability improvements supported by system observability and performance tuning.
-
lastminute.com Improves Search Scalability Using Microservices with RabbitMQ and Redis
The team at lastminute.com rearchitected the search result aggregation process by breaking up the single service into multiple ones and introducing asynchronous integration. Developers used RabbitMQ for messaging and Redis for storing results from data suppliers. The revised architecture improved scalability and deployability and reduced resource utilization.