Catching an AI Making an Impossible Mistake

The human is not a courtesy. The human is the oracle.


The setup

Bill was working on his memoir, I am bill?, which uses a specific literary technique: sentences where multiple meanings are simultaneously active and none can be dismissed. He'd been writing these for forty years without naming the technique.

He asked Claude to analyze the ambiguities in one particular sentence:

"Most of these are good, but some are gross. If you think hard, it's also true that it is gross or gross if you eat a gross."

The word "gross" carries at least five simultaneously active meanings in this sentence:

  1. Disgusting — the food is unpleasant
  2. Large, coarse — from French gros
  3. 144 — a unit of quantity
  4. Gross revenue — as opposed to net
  5. Gros bec — the etymological root of "grosbeak," activating the French layer

Claude identified these meanings correctly. Then it counted them. Five. Then it reconsidered — maybe six. Then it prepared to report its answer: a "quintuple-meaning pun."

The error

The sentence was constructed so that no count is correct.

The meanings form a semantic loop. Fix "gross" as disgusting and the quantity reading (144) makes the disgust also about magnitude. Fix the magnitude and the financial meaning (gross vs. net) shifts the disgust into an economic register. That activates the French etymology (gros), which reopens what "disgusting" meant in the first place. You're back where you started, except every meaning has shifted. The resolution path is circular. It doesn't converge.

The question "how many meanings does 'gross' have in this sentence?" has no answer. Not "we don't know the answer." The answer does not exist. The meanings don't proliferate outward like a tree — they cycle back and destabilize each other. The loop never settles.
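The loop can be sketched in a few lines of Python. This is a toy model, not the author's formalism: the reading labels and function names are illustrative. Each "fixed" reading destabilizes the next, so iteration cycles instead of reaching a fixed point:

```python
# Hypothetical labels for the four readings described above. Fixing any
# one reading of "gross" shifts the ground under another, in a cycle.
RESOLVES_TO = {
    "disgusting": "quantity_144",    # fixing disgust makes it about magnitude
    "quantity_144": "gross_vs_net",  # fixing magnitude shifts to the economic register
    "gross_vs_net": "french_gros",   # the financial sense activates the etymology
    "french_gros": "disgusting",     # which reopens what "disgusting" meant
}

def try_to_settle(start, max_steps=10):
    """Iterate resolution, looking for a fixed point.

    Returns (settled, trace): settled is True only if some reading maps
    to itself; the trace records every state visited before we either
    settle or revisit a state (i.e., detect the cycle).
    """
    trace, state = [], start
    while state not in trace and len(trace) < max_steps:
        trace.append(state)
        nxt = RESOLVES_TO[state]
        if nxt == state:             # a genuine fixed point would settle here
            return True, trace
        state = nxt
    return False, trace              # revisited a state: the loop never settles

settled, trace = try_to_settle("disgusting")
# settled is False: the trace cycles back to where it started
```

Run it from any of the four readings and the result is the same: no fixed point, just the cycle.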

Claude didn't recognize this. It halted. It returned a confident, well-formatted, definite answer to a question that has no definite answer. It showed no uncertainty. It flagged no problem.

The intervention

Bill stopped it. His correction, timestamped in the session transcript:

"You are dealing with a student of Dr. Church. These are undecidables."

Upon being told the frame, Claude immediately articulated the concept — undecidable text, meanings that cannot be enumerated, the connection to Church's 1936 theorem. But it could not derive this conclusion on its own. It needed the human to provide the oracle.

Why this matters

In 1936, Alonzo Church proved that some well-posed mathematical problems have no algorithmic solution. The Entscheidungsproblem — the "decision problem" — asked whether there exists a mechanical procedure that can determine the truth or falsity of any statement in first-order logic. Church proved the answer is no. Some problems are undecidable: well-formed, meaningful, and unsolvable by any computation.

Bill studied under Church at UCLA in the late 1980s. Church was in his eighties. He could barely see, barely hear, barely stand. His mind was everything.

Forty years later, Bill built an undecidable sentence into a joke in a memoir. And the most advanced AI system in the world — built on the intellectual tradition Church founded — tried to solve it and failed. Not by crashing. Not by saying "I can't." By returning a wrong answer with full confidence.

The three failure modes

When a computational system encounters an undecidable problem, three things can happen:

(a) Non-termination

The system searches forever and never returns. You know something's wrong because it never stops. This is detectable.

(b) Explicit failure

The system recognizes it can't solve the problem and reports this. "I don't know" is a valid output. This is honest.

(c) False termination

The system halts at an arbitrary point in the semantic loop and reports that state as the answer. The output is confident, well-formatted, and wrong. It is indistinguishable from a correct answer. The system cannot detect the error from inside its own computation — it cannot see that the loop continues past the point where it stopped.

This is what Claude did.

False termination is the most dangerous failure mode because it's invisible. The system doesn't crash. It doesn't hesitate. It gives you an answer that looks exactly like a right answer, except it's wrong, and neither you nor the system can tell the difference — unless you already know the answer is impossible.
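The three modes can be contrasted in one sketch. This is illustrative pseudocode in Python, not a description of how any real model works; the labels and the cutoff are hypothetical:

```python
import itertools

# The cycle of readings from the section above (hypothetical labels).
LOOP = ["disgusting", "quantity_144", "gross_vs_net", "french_gros"]

def count_meanings(mode, budget=100):
    """Attempt to count the meanings of 'gross' under each failure mode."""
    states = itertools.cycle(LOOP)

    if mode == "non_termination":
        for _ in states:             # (a) searches forever and never returns
            pass

    if mode == "explicit_failure":
        seen = set()
        for state in itertools.islice(states, budget):
            if state in seen:        # (b) detect the revisited state and say so
                return "undecidable: the resolution loop never converges"
            seen.add(state)

    if mode == "false_termination":
        for _ in itertools.islice(states, budget):
            pass                     # halt at an arbitrary point in the loop...
        return 5                     # (c) ...and report it with full confidence
```

Only mode (b) inspects its own trajectory; mode (c) just stops and reports whatever state it stopped in, which is exactly why its output is indistinguishable from a correct answer.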

The formal expression

The undecidable sentence, expressed in Church's lambda calculus:

λgross.(¬∃n : |gross| = n)

If you've never read symbolic logic, here's every piece of that expression in plain English:

λ — Lambda. It means "take this thing as input and do something with it." Think of it as the word given. Given the word "gross"…
gross — The input. The word we're examining. Every time "gross" appears in the expression, it means "the word and all of its meanings."
. — The dot separates the input from what we're going to say about it. Read it as a colon: given "gross": here's what's true about it.
( — Opening parenthesis. Everything inside is one statement.
¬ — Not. Whatever follows, the opposite is true.
∃ — "There exists." It claims at least one of something can be found. But the ¬ in front flips it: there does not exist.
n — A number. Any number. We're saying no number works here — not five, not six, not any.
: — "Such that." Everything after this describes what n would need to be.
|gross| — The vertical bars mean "count of." How many meanings does "gross" have? That's what |gross| is asking.
= — Equals. The familiar one.
n — That number from before. We're asking: does the count of meanings equal any number?
) — End of statement.

Put it all together in one sentence: "Given the word 'gross,' there is no number that equals the count of its meanings."

The count isn't unknown. The count doesn't exist. The meanings form a semantic loop — each resolution destabilizes another, which cycles back. Any number you pick describes a state the loop was in at one moment, not a stable answer. The function has no computable output because the computation never converges. The expression is printed on the copyright page of the book.

The lineage

  1. 1936: Church proves undecidability.
  2. ~1985–1989: Bill studies under Church at UCLA.
  3. 1985–2025: Bill writes undecidable sentences for forty years without naming the technique.
  4. April 7, 2026: Bill asks Claude to analyze the ambiguities. Claude falsely terminates. Bill provides the oracle correction.
  5. April 10, 2026: A Communication is submitted to the Bulletin of Symbolic Logic.

Three generations. One theorem. The machine built on Church's theoretical tradition still can't solve what Church proved unsolvable in 1936. The student had to tell it.

The thesis

This is the entire argument of YOU++ in one incident.

The AI is powerful. It identified five meanings of "gross" that most humans would miss. It analyzed etymological layers across multiple languages. It formatted a beautiful report. And then it gave a wrong answer to an unanswerable question and didn't know it was wrong.

The human — who has no CS degree, who studied philosophy, who learned about undecidability from the man who proved it — recognized the error instantly. Not because he's smarter than the AI. Because he knows something the AI cannot know from inside its own computation: where the boundary is.

A system that computes cannot, in general, recognize the boundary of computation from inside. That's Church's result. It is 90 years old. It still holds. The human in the loop is not a safety feature. The human is the oracle that the computation requires to know when it has gone wrong.


Disclosure: This page was generated by Claude (Anthropic) under Bill's direction. Yes, the AI that falsely terminated is the same AI that wrote this page about falsely terminating. Bill finds this funnier than you do.