Strongly Bounded AI: Definitions and Strategic Implications

A 2025 draft arguing that AI safety discussion lacks terminology for deliberately limited, highly-predictable AI systems aimed at takeover risk rather than human misuse. It proposes the term “Strongly Bounded AI” (alternatives offered: “Highly-Reliable AI,” “Boring AI”) for systems whose limited capability, reliance on well-established technology, and restricted intelligence make their behavior predictable, and distinguishes the concept from alignment, safety, control, scalable oversight, Comprehensive AI Services, and Guaranteed Safe AI. It draws an analogy to engineering’s preference for “boring tech” and to established reliability subfields, and argues that bounded systems should be used to help reason about and control frontier AI, framing the capability gap between frontier and controlled systems as the measure of risk. The draft lists possible applications (data oversight, strategic recommendations, resource security, high-reliability operations, audit assistance) and answers several anticipated objections. It is marked exploratory; a closing “Other unique bits” section collects loose points from a rougher second pass, including a caveat that the strategy depends on trusting those in power. Collaborator comments have been removed.


Ozzie Gooen - April 14 2025, Draft. Quick post for the EA Forum / LessWrong.

Epistemic status: Exploratory concept with moderate confidence. This represents my current thinking on a potentially valuable framing for AI safety, drawing on established engineering principles. In the last few years discussion around these topics has exploded - I wouldn’t be surprised if there were great existing works that I don’t know about and can’t quickly find.

—

I feel the AI safety conversation lacks terminology for limited, safe, and useful AI systems that address takeover risks rather than misuse by humans. This concept goes beyond alignment to include capability restrictions, reliance on established technologies, and intelligence limitations for predictability.

On the Terminology

One thing I feel is missing from AI safety conversations is strong and versatile terminology for limited, safe, and useful AI systems.

This concept isn’t just about alignment. It’s also about:

Substantial capability restrictions (using older models, strong compute limits)
Exclusive use of highly-tested and well-established technologies
Intelligence limitations that make behavior highly predictable

I think some potential names for these systems could be:

Strongly Bounded AI
Highly-Reliable AI
Boring AI

For the rest of this document, I’ll go with “Strongly Bounded AI.”

“Strongly Bounded AIs” are not necessarily ones with substantial alignment or safeguards, but rather AIs that we have good reason to believe do not pose severe AI takeover risks. This means they can either be very weak systems without safeguards (like many of the systems of today), or stronger systems with a much greater degree of safeguards.

We already have the somewhat understood areas of “Control,” “Scalable Oversight,” etc., which approach similar ideas. But I believe these fields typically investigate “specific AI layers directly overseeing risky AIs” rather than broader AI services/agents that are doing more regular duties in the world.

We also have the fields of “Comprehensive AI Services” (see Drexler) and “Guaranteed Safe AI” (see Davidad). These are closer to the idea I’m proposing, but are more specific. I think neither is necessary for “Strongly Bounded AI.”

A “Bounded” AI is also arguably different from an aligned or a safe AI. Both “aligned” and “safe” currently have fairly broad and imprecise definitions by comparison. I’d also flag that “Boundedness” is really about accident risks, not misuse risks. A bad actor could use a bounded system to do significant harm. This is akin to the importance of reliability in military technology - such reliability is useful for the military, but the technology can obviously still be used destructively if desired.

Engineering Culture and Established Patterns

In tech companies, there’s an established virtue of using “boring tech” – Postgres, SQL, REST, etc. There’s always something fancier trending on Hacker News, but these cutting-edge systems come with major uncertainties and liabilities. Typically, new programmers enthusiastically advocate for the latest JavaScript framework while experienced engineers spend time arguing for proven technologies.

Engineering also features many well-understood and distinct subfields for highly-reliable systems: “Fault-Tolerant Systems,” “Ultra-Reliable Systems,” “High-Assurance Systems,” “Formal Verification,” etc. I believe these concepts effectively carve out market positions for unusually secure technology. Major software products like Microsoft Windows or Google Docs don’t advertise themselves as “Fault-Tolerant Systems” or “Formally Verified” – these terms are reserved for genuinely reliability-focused systems. While these terms can function as marketing buzzwords, I think they still help establish meaningful categories.

Current State and Future Potential

I think most AI agents today are weak and highly limited. I expect that at least 99% of them could not cause catastrophic damage (say, $100B in damages) anytime soon – the technology is simply too weak and expensive. I’d feel fairly comfortable using many current systems without worrying about major alignment risks.

My strong expectation is that a tremendous amount of good and global stability could come from developing “Strongly Bounded AIs.” And perhaps most importantly, I think the game plan should entail using “Strongly Bounded AIs” to help us reason about, develop, and control cutting-edge AI technologies.

I believe the real AI takeover threat comes from frontier AI agents. I think the capability gap between frontier models and our controlled AI systems represents the potential damage frontier AI could cause. If it’s “a powerful malicious frontier AI agent” versus humans alone, there’s a massive potential for takeover. If it’s the same agent versus “robust, reliable and controllable AIs,” I’d feel much better about our defensive position.

Applications and Evolution

Over time, I think we’ll develop better methods for creating “Strongly Bounded AIs” that push the frontier of effectiveness while maintaining safety. One of the main things we should probably do with cutting-edge AIs (to the extent we use them) is to help us create better “Strongly Bounded AIs.”

What could “Strongly Bounded AIs” do? In my view:

Oversee personal data on devices
Make strategic recommendations for organizations
Secure key resources beyond traditional access management (e.g., AI monitoring bank withdrawals for signs of duress)
Handle bounded high-reliability operations in medicine and defense
Assist auditors examining potentially dangerous organizations/systems
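
To make the bank-withdrawal example above a little more concrete, here is a minimal sketch of what a deliberately "boring" monitor could look like. It is purely illustrative and not a proposal for a real system: the `Withdrawal` fields, the `duress_flags` and `decide` functions, and every threshold are hypothetical assumptions of mine. The point is only that the component can do just two things (allow, or hold for a human), and that every decision can be traced to an explicit rule.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Withdrawal:
    """Hypothetical record of a withdrawal request (fields are assumptions)."""
    amount: float     # requested amount, in the account's currency
    hour: int         # local hour of day, 0-23
    new_payee: bool   # True if this recipient has never been paid before


def duress_flags(w: Withdrawal, typical_max: float) -> List[str]:
    """Return human-readable reasons to pause the withdrawal.

    All thresholds are illustrative assumptions, not tuned values.
    """
    flags = []
    if w.amount > 5 * typical_max:
        flags.append("amount far above usual")
    if w.hour < 5:
        flags.append("unusual hour")
    if w.new_payee and w.amount > typical_max:
        flags.append("large transfer to new payee")
    return flags


def decide(w: Withdrawal, typical_max: float) -> str:
    """The monitor's entire action space: allow, or escalate to a human."""
    return "hold_for_human_review" if duress_flags(w, typical_max) else "allow"


routine = Withdrawal(amount=50.0, hour=14, new_payee=False)
suspicious = Withdrawal(amount=5000.0, hour=3, new_payee=True)
print(decide(routine, typical_max=200.0))
print(decide(suspicious, typical_max=200.0))
```

A real bounded system would presumably use a learned model rather than three hand-written rules, but the same design choice would apply: a narrow action space and auditable reasons for each decision.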

Addressing Common Questions

“Doesn’t delegating to AI systems increase takeover risks?” I think this is an oversimplified view and would often argue the opposite. I’d expect that “Strongly Bounded AIs” could make the world much more secure against frontier adversarial AIs, not less. But of course, one would need to implement smart tradeoffs.

“Isn’t opposing frontier AI while supporting limited AI confusing?” I think engineering has a long history of distinguishing between safe and unsafe technologies. I don’t think the difference between AI systems is unusually strange compared to previous work in reliability engineering and computer security.

“Won’t this term become meaningless marketing?” I’m not that cynical. I think safety-minded people should develop clear standards for safe systems, then work to form the language. We already have some terminology for highly-trustworthy technology. Even when one term gets semantically diluted due to marketing, others can emerge to take its place.

Ultimately, we want systems where:

We can reliably predict their behavior patterns.
We can have assurances that the downside risks of using them are minimal.

Other unique bits

A few points from a rougher second pass of this draft, not covered above:

It depends on trusting those in power. There are, of course, many ways one could mess up such a strategy — so promoting it depends partly on how reasonable one expects those in power to be.
Framed as objections: “Isn’t this just Drexler’s Comprehensive AI Services?” and “Isn’t this just Davidad’s Guaranteed Safe AI?” — both are close, but I’d treat each as one possible (and not necessary) route to a Strongly Bounded AI rather than the whole idea.
Strongly Bounded AIs could also act as assistants to auditors of dangerous organizations or systems (cf. “superhuman governance”).