Powering the agents: Workers AI now runs large models, starting with Kimi K2.5

Kimi K2.5 is now on Workers AI, helping you power agents entirely on Cloudflare’s Developer Platform. Learn how we optimized our inference stack and reduced inference costs for internal agent use cases.

7 April 2026 at 09:01 am

Cloudflare has added the Kimi K2.5 model to its Workers AI platform. Developers can now build and deploy agents directly on the Cloudflare Developer Platform, with a single infrastructure covering the entire agent lifecycle.

The Cloudflare Developer Platform has long been a robust environment for building and running applications, with tools such as Durable Objects for state management, Workflows for long-running tasks, and Dynamic Workers or Sandbox containers for secure execution. However, these primitives provided only the execution environment; the agents running on them still needed a powerful AI model.

To address this gap, Cloudflare now offers frontier open-source models on its AI inference platform. The company is starting with Moonshot AI's Kimi K2.5, which has a full 256k context window and supports multi-turn tool calling, vision inputs, and structured outputs. The model is well suited to agentic tasks, combining strong reasoning with efficiency.
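To make "multi-turn tool calling" concrete, here is a minimal TypeScript sketch of the loop an agent typically runs around such a model. It is an illustration under stated assumptions, not Cloudflare's or Moonshot AI's implementation: the message and reply shapes (`ChatMessage`, `ModelReply`, `ToolCall`) and the helpers `applyToolCalls` and `runAgent` are hypothetical names invented for this example, and the model call is passed in as a function so that no specific API is assumed.

```typescript
// Hypothetical shapes for illustration only; the real Workers AI
// request and response formats may differ.
type Role = "system" | "user" | "assistant" | "tool";

interface ToolCall {
  id: string;
  name: string;
  arguments: Record<string, unknown>;
}

interface ChatMessage {
  role: Role;
  content: string;
  toolCalls?: ToolCall[]; // set on assistant turns that request tools
  toolCallId?: string;    // set on tool turns that answer a request
}

interface ModelReply {
  content: string;
  toolCalls: ToolCall[];
}

type ToolHandler = (args: Record<string, unknown>) => unknown;

// The model call is injected, so this sketch assumes no particular API.
type RunModel = (messages: ChatMessage[]) => Promise<ModelReply>;

// Append the assistant turn, then one tool-result turn per requested call.
function applyToolCalls(
  history: ChatMessage[],
  reply: ModelReply,
  handlers: Record<string, ToolHandler>,
): ChatMessage[] {
  const next: ChatMessage[] = [
    ...history,
    { role: "assistant", content: reply.content, toolCalls: reply.toolCalls },
  ];
  for (const call of reply.toolCalls) {
    const handler = handlers[call.name];
    // Unknown tools are reported back to the model instead of throwing.
    const result = handler
      ? handler(call.arguments)
      : { error: `unknown tool: ${call.name}` };
    next.push({
      role: "tool",
      content: JSON.stringify(result),
      toolCallId: call.id,
    });
  }
  return next;
}

// Call the model until it stops requesting tools or the turn budget runs out.
async function runAgent(
  runModel: RunModel,
  prompt: string,
  handlers: Record<string, ToolHandler>,
  maxTurns = 5,
): Promise<string> {
  let history: ChatMessage[] = [{ role: "user", content: prompt }];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await runModel(history);
    if (reply.toolCalls.length === 0) return reply.content;
    history = applyToolCalls(history, reply, handlers);
  }
  throw new Error(`no final answer after ${maxTurns} turns`);
}
```

In a Worker, `runModel` would wrap the actual inference call, using the model identifier listed in the Workers AI catalog and translating between these hypothetical shapes and the real response format. Because the full history is resent on every turn, a large context window is what lets a loop like this run for many turns.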

With Kimi K2.5 available directly on the Cloudflare Developer Platform, the entire agent lifecycle can run on a single platform. This simplifies development and lets the model powering the agent be optimized for performance and cost.

Cloudflare has already tested Kimi K2.5 as the engine for its internal development tools, particularly within the OpenCode environment. Cloudflare engineers have been using Kimi as a daily driver for agentic coding tasks, and the model has been integrated into the company's automated code review pipeline. This integration is evident in the public code review agent, Bonk, which operates on Cloudflare GitHub repositories.

In production, Kimi K2.5 has proven to be a fast and efficient alternative to larger proprietary models without compromising on quality. Initially launched as an experiment, the model quickly became critical to Cloudflare's internal operations, demonstrating its value and potential for broader adoption.

This development underscores Cloudflare's commitment to making its platform the best choice for building and deploying agents. By offering a robust infrastructure and integrating cutting-edge AI models like Kimi K2.5, the company is positioning itself at the forefront of the agent development ecosystem. As Cloudflare continues to expand its AI offerings, developers can expect even more powerful and efficient tools to support the creation of intelligent, agentic applications.

Source: The Cloudflare Blog
