Keep Deterministic Work Deterministic

6 April 2026 at 07:38 pm

In AI-driven development and agentic engineering, keeping deterministic work reliable has become a central concern for researchers and developers alike. This is the second article in a series on these ideas; it builds on part one, and a third installment is due on April 2 on O’Reilly Radar. The series examines how to build systems that operate with precision and consistency even when they are powered by artificial intelligence.

The discussion starts from a well-known adage: "The first 90 percent of the code accounts for the first 90 percent of the development time. The remaining 10 percent of the code accounts for the other 90 percent of the development time." The joke makes a serious point: the final, often overlooked parts of a project take far more effort than their size suggests. AI-driven development amplifies the problem, because a system must not only be built but also refined until it behaves accurately and reliably.

One experiment that explores this is a blackjack simulation in which a large language model (LLM) plays hundreds of hands according to blackjack strategies written in plain English. The model uses these strategy descriptions to decide whether to hit, stand, or double down in each hand. Meanwhile, deterministic code handles the card dealing, the arithmetic, and the rule verification.
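The split described above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the experiment: the card encoding, the function names `hand_total`, `legal_actions`, and `check_model_action`, and the simplified doubling rule are all assumptions made for the example. The idea is that code computes the total and the set of legal moves, so the model only has to choose among them.

```python
from typing import List

# Hypothetical card encoding: ranks as strings, "A" for an ace.
CARD_VALUES = {str(n): n for n in range(2, 11)}
CARD_VALUES.update({"J": 10, "Q": 10, "K": 10, "A": 11})


def hand_total(cards: List[str]) -> int:
    """Return the best blackjack total, counting each ace as 11 or 1."""
    total = sum(CARD_VALUES[card] for card in cards)
    aces_as_eleven = cards.count("A")
    # Demote aces from 11 to 1 while the hand would otherwise bust.
    while total > 21 and aces_as_eleven > 0:
        total -= 10
        aces_as_eleven -= 1
    return total


def legal_actions(cards: List[str]) -> List[str]:
    """Rule check done in code: the actions the model may choose from."""
    if hand_total(cards) >= 21:
        return ["stand"]
    actions = ["hit", "stand"]
    if len(cards) == 2:
        # Simplified assumption: doubling is allowed only on the first two cards.
        actions.append("double")
    return actions


def check_model_action(cards: List[str], action: str) -> str:
    """Reject any model output that is not a legal action for this hand."""
    allowed = legal_actions(cards)
    if action not in allowed:
        raise ValueError(f"illegal action {action!r}; allowed: {allowed}")
    return action
```

With this arrangement the model never adds card values itself. It receives the total that `hand_total` computed, and whatever it answers passes through `check_model_action` before the game state changes.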

Early iterations of this simulation had a pass rate of only 37%, leaving substantial room for improvement. The LLM frequently miscalculated card totals, skipped the dealer's turn, or ignored the strategy it was supposed to follow. Crucially, these mistakes compounded. If the model miscounted the player's total on the third card, every subsequent decision was based on a wrong number, and the entire hand became invalid.
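The compounding effect can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that a hand consists of several steps, that each step succeeds independently with the same probability, and that a single wrong step invalidates the hand. The per-step accuracy values are hypothetical and are not measurements from the experiment.

```python
def hand_success_probability(per_step_accuracy: float, steps: int) -> float:
    """Chance that every step in a hand is correct.

    Assumes independent steps and that one wrong step invalidates the hand.
    """
    if not 0.0 <= per_step_accuracy <= 1.0:
        raise ValueError("per_step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_accuracy ** steps


if __name__ == "__main__":
    # Hypothetical per-step accuracies, chosen only to show the trend.
    for accuracy in (0.90, 0.95, 0.99):
        for steps in (1, 5, 10):
            p = hand_success_probability(accuracy, steps)
            print(f"accuracy={accuracy:.2f} steps={steps:2d} -> {p:.3f}")
```

Under these assumptions, a model that gets each step right most of the time can still fail a large share of hands once the hands are long enough, which is why moving arithmetic and rule checks into code removes so many failure points at once.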

To understand the scale of this problem, it helps to consider the March of Nines. The term comes from Andrej Karpathy, drawing on his experience building self-driving systems at Tesla. Reaching the first 90% of reliability is relatively straightforward, but every step beyond that is a new project of its own: moving from 90% to 99% takes roughly the same amount of engineering work as moving from 99% to 99.9%. Each additional nine demands comparable resources while removing a smaller absolute share of failures, and the process never truly ends.
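A short calculation shows what each nine buys. The helper names below are made up for this example, and the figure of 1,000 runs is arbitrary; the only input taken from the text is that each nine cuts the failure rate by a factor of ten.

```python
def reliability(nines: int) -> float:
    """Reliability with the given number of nines: 1 -> 0.9, 2 -> 0.99, ..."""
    if nines < 1:
        raise ValueError("nines must be at least 1")
    return 1.0 - 10.0 ** (-nines)


def expected_failures(nines: int, runs: int) -> float:
    """Expected number of failed runs out of `runs` at that reliability."""
    if nines < 1:
        raise ValueError("nines must be at least 1")
    return runs * 10.0 ** (-nines)


if __name__ == "__main__":
    runs = 1000  # arbitrary illustration size
    for nines in range(1, 5):
        print(
            f"{nines} nine(s): reliability={reliability(nines):.4f}, "
            f"expected failures per {runs} runs={expected_failures(nines, runs):.1f}"
        )
```

Each row removes ten times fewer failures in absolute terms than the row before it, yet by the March of Nines each row costs about the same engineering effort.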

To demonstrate how such failures can compound, one can conduct a simple experiment using an AI chatbot running an early 2026 model, such as ChatGPT 5.3 Instant. By inputting a specific sequence of commands, users can observe firsthand the challenges in achieving consistent, deterministic outcomes in AI-driven systems.

In conclusion, the path to reliable, deterministic AI-driven development is shaped by the March of Nines. Initial progress can be swift, but true reliability takes sustained work, meticulous attention to detail, and a deep understanding of both the AI models and the deterministic systems they interact with. As the series continues, we will explore further insights and strategies for overcoming these challenges and building systems that operate with the precision and consistency real-world applications require.

Source: Radar