
Confluent Cloud for Apache Flink is Now Generally Available with AI Features


Last month, Confluent announced the general availability (GA) of Confluent Cloud for Apache Flink, a fully managed service that enables real-time data processing and the creation of high-quality, reusable data streams. The service is available on Amazon Web Services (AWS), Google Cloud, and Microsoft Azure.

Apache Flink, a trusted and widely used stream processing framework, has been adopted by companies such as Airbnb, Uber, LinkedIn, and Netflix; according to Confluent, it was downloaded almost one million times in 2023. Confluent built a cloud offering for Apache Flink so that users can focus on their business logic without worrying about operations. The GA release follows last year's open preview.

Confluent exposes Flink through ANSI-standard SQL, simplifying real-time data exploration and application development. Developers can write and run Flink SQL queries through Confluent's CLI (SQL shell) and SQL Workspaces, which offer features such as auto-completion and graphical interfaces for efficient workflow management.

SQL client in Confluent Cloud CLI (Source: Confluent blog post)
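As an illustration, a query of the kind that can be run from the SQL shell or a SQL Workspace might look like the following sketch; the pageviews table, its columns, and the event_time attribute are assumptions for illustration, not taken from Confluent's documentation.

```sql
-- Hypothetical example: count page views per user in one-minute tumbling windows.
-- The pageviews table, its columns, and the event_time attribute are illustrative.
SELECT user_id,
       window_start,
       window_end,
       COUNT(*) AS view_count
FROM TABLE(
  TUMBLE(TABLE pageviews, DESCRIPTOR(event_time), INTERVAL '1' MINUTE))
GROUP BY user_id, window_start, window_end;
```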

Additionally, Confluent's Data Portal and recently introduced Flink Actions offer self-service data exploration and pre-built stream processing transformations, making it easier for users to leverage the power of Flink without deep expertise.
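While Flink Actions packages such transformations behind a UI, a common one like deduplication can also be expressed directly in Flink SQL using the standard ROW_NUMBER pattern; the sketch below assumes a hypothetical orders table keyed on order_id with an event_time attribute.

```sql
-- Hypothetical sketch: keep only the first event seen per order_id,
-- using Flink SQL's standard deduplication pattern. Table and column
-- names are assumptions for illustration.
SELECT order_id, customer_id, amount, event_time
FROM (
  SELECT *,
         ROW_NUMBER() OVER (
           PARTITION BY order_id
           ORDER BY event_time ASC) AS row_num
  FROM orders
)
WHERE row_num = 1;
```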

The fully managed Flink service is a serverless offering with features like elastic autoscaling, automated updates, and usage-based billing. It ensures efficient resource utilization and high reliability.

The company writes:

With Confluent Cloud for Apache Flink, automatic updates reduce the risk of downtime or data loss due to outdated software or vulnerabilities. Additionally, our autoscaler ensures efficient resource allocation and reduces the risk of performance bottlenecks, throttling, or failures during peak usage periods.

Additionally, integration with Kafka and enhancements in metadata management and security further streamline operations and ensure data integrity.

Confluent also announced an upcoming AI Model Inference feature for the Apache Flink service, designed to streamline data cleaning and processing tasks and accelerate AI and ML application development. The feature simplifies AI development by letting organizations use familiar SQL syntax to interact directly with AI/ML models, minimizing the need for specialized tools and languages.

The company writes in a recent blog post:

With AI Model Inference in Confluent Cloud for Apache Flink, organizations can use simple SQL statements to make calls to remote model endpoints, including OpenAI, AWS Sagemaker, GCP Vertex, and Azure, and orchestrate data cleaning and processing tasks on a single platform.

Jean-Sébastien Brunner, director of product management at Confluent, told InfoQ about the upcoming AI Model Inference feature:

Users can define AI/ML models directly in Flink SQL, eliminating the need for custom code or user-defined functions to call LLMs like OpenAI, thereby simplifying the integration of AI/ML into Flink applications. This approach allows users to call models using familiar SQL syntax, facilitating seamless reuse across different applications and eliminating the need to switch between languages or tools for data and AI/ML tasks.
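Since the feature had not yet shipped at the time of the announcement, the exact syntax may differ, but based on the description above, registering and calling a remote model from Flink SQL could look roughly like this sketch; the model, connection, table, and column names are illustrative assumptions.

```sql
-- Illustrative sketch only; the final syntax of AI Model Inference may differ.
-- Register a remote LLM endpoint (e.g., OpenAI) as a named model.
CREATE MODEL sentiment_model
  INPUT  (review_text STRING)
  OUTPUT (sentiment STRING)
  WITH (
    'provider'          = 'openai',
    'task'              = 'classification',
    'openai.connection' = 'my-openai-connection'  -- hypothetical connection name
  );

-- Invoke the model inline from a query over a hypothetical stream of reviews.
SELECT review_id, review_text, sentiment
FROM product_reviews,
     LATERAL TABLE(ML_PREDICT('sentiment_model', review_text));
```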

He also described how organizations can leverage this feature in practice:

A data engineer can construct a data pipeline for personalized e-commerce recommendations, ingesting customer interactions and product data. Real-time interactions feed into an LLM model for inference, producing personalized recommendations based on user behavior. All this is managed seamlessly in SQL without switching contexts or relying on external teams.
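A rough sketch of such a pipeline, again with hypothetical table, column, and model names and illustrative syntax, might look like the following:

```sql
-- Hypothetical sketch of the recommendation pipeline described above.
-- Table, column, and model names are assumptions; the ML_PREDICT output
-- column depends on how the model is defined.
CREATE TABLE recommendations AS
SELECT i.user_id,
       p.product_name,
       rec.recommendation
FROM customer_interactions AS i,
     products AS p,
     LATERAL TABLE(
       ML_PREDICT('recommendation_model',
                  CONCAT('User ', i.user_id, ' viewed ', p.product_name))) AS rec
WHERE i.product_id = p.product_id;
```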

Lastly, the company will add new auto-scaling clusters, called Freight Clusters, to the service, tailored for high-throughput use cases with flexible latency requirements. These clusters will improve cost efficiency by automatically adjusting resources to demand without manual intervention.
