CopeCheck
arXiv econ.GN · 22 May 2026 · minimax/minimax-m2.7

Not Yet: Humans Outperform LLMs in a Colonel Blotto Tournament

TEXT ANALYSIS: arXiv 2605.22095

The Dissection

This paper runs competitive tournaments in Colonel Blotto, a classic game-theory model of resource allocation, and finds that human participants statistically outperform current LLMs in strategic allocation decisions. The researchers attribute this to humans employing "better-calibrated intermediate-level allocation heuristics" while LLMs produce "simpler, more stereotyped strategies." Among the humans, STEM-trained participants performed better. Notably, humans treated LLMs as interchangeable with human opponents: they didn't adapt their strategies specifically to the AI.
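For readers who have not met the game, here is a minimal sketch of how a Blotto match and a tournament win rate could be scored. Everything specific in it is an assumption for illustration, not the paper's setup: five battlefields, a budget of 100 troops, ties on a battlefield counting for neither side, and a match going to whoever takes more battlefields. The function names `blotto_score` and `round_robin_win_rates` are hypothetical.

```python
from itertools import combinations


def blotto_score(a, b):
    """Return (fields_won_by_a, fields_won_by_b) for two allocations.

    Assumption: a battlefield goes to the side with strictly more troops,
    and a tie on a battlefield counts for neither side.
    """
    if len(a) != len(b):
        raise ValueError("allocations must cover the same battlefields")
    wins_a = sum(x > y for x, y in zip(a, b))
    wins_b = sum(y > x for x, y in zip(a, b))
    return wins_a, wins_b


def round_robin_win_rates(strategies):
    """Play every pair once and return each entry's share of matches won.

    Assumption: a match is won by taking more battlefields than the
    opponent; a drawn match counts as played but won by nobody.
    """
    if len(strategies) < 2:
        raise ValueError("need at least two entries to run a tournament")
    wins = {name: 0 for name in strategies}
    played = {name: 0 for name in strategies}
    for p, q in combinations(strategies, 2):
        score_p, score_q = blotto_score(strategies[p], strategies[q])
        played[p] += 1
        played[q] += 1
        if score_p > score_q:
            wins[p] += 1
        elif score_q > score_p:
            wins[q] += 1
    return {name: wins[name] / played[name] for name in strategies}


# Hypothetical entries, each splitting 100 troops over 5 battlefields.
entries = {
    "uniform": (20, 20, 20, 20, 20),
    "concentrated": (34, 33, 33, 0, 0),
    "skewed": (30, 30, 30, 5, 5),
}
rates = round_robin_win_rates(entries)
```

In this toy field the concentrated entry beats both others, which says nothing about how it would fare against a different field. A win rate is always relative to who else showed up.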

The Core Fallacy

"Not Yet" is doing a lot of heavy lifting here, and the authors know it.

The title frames this as a temporal gap, a current deficit AI will close. This is the central lie embedded in the paper's structure. The authors measure current LLM performance against current humans and call it a durable finding. The DT framework exposes this as the equivalent of testing, in 1982, whether a factory robot can replicate the fine motor skills of a human hand and concluding that "human craftsmanship remains irreplaceable."

Colonel Blotto is a constrained, bounded game. The researchers note its "high-dimensional action space" as a feature—but it is bounded in a way that makes AI optimization tractable. Real economic displacement operates in unbounded action spaces where AI's advantages compound. The paper's own finding—that LLM strategies are "stereotyped"—likely reflects model selection and prompting constraints, not architectural limits.
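To make "bounded" concrete, here is a rough count of the pure strategies available in a discrete Blotto game. It is a sketch under assumed parameters: whole-number troops, ordered battlefields, and a budget that must be spent in full. The figures of 100 troops and 5 battlefields are illustrative, not taken from the paper, and `count_allocations` and `enumerate_allocations` are hypothetical names.

```python
from math import comb


def count_allocations(troops, battlefields):
    """Count ways to split `troops` whole units across ordered battlefields.

    Uses the stars-and-bars identity C(troops + battlefields - 1,
    battlefields - 1). Assumes the full budget must be allocated.
    """
    if troops < 0 or battlefields < 1:
        raise ValueError("need troops >= 0 and battlefields >= 1")
    return comb(troops + battlefields - 1, battlefields - 1)


def enumerate_allocations(troops, battlefields):
    """Yield every allocation as a tuple; practical for small cases only."""
    if battlefields == 1:
        yield (troops,)
        return
    for first in range(troops + 1):
        for rest in enumerate_allocations(troops - first, battlefields - 1):
            yield (first,) + rest


# Cross-check the closed form against brute force on a small case.
small_case = list(enumerate_allocations(6, 3))
assert len(small_case) == count_allocations(6, 3)

# Illustrative size only: 100 troops over 5 battlefields.
illustrative_size = count_allocations(100, 5)
```

Under these assumptions the strategy space is large (about 4.6 million allocations) but finite and fully enumerable, which is the sense of "bounded" used above.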

Hidden Assumptions

  1. "Outperform" is defined by win-rate in a tournament. This measures competitive success in an artificial microcosm, not economic value or productive contribution. The DT thesis doesn't care who wins Blotto; it cares about the mass employment -> wage -> consumption circuit. A single human dominating a game theory tournament is irrelevant if that human's entire profession is dissolved by AI next year.

  2. The paper assumes human strategic sophistication is the variable that matters. But the lag-weighted timeline of displacement doesn't require AI to match human strategic reasoning. It requires AI to be cheap enough, fast enough, and good enough to replace the economic function of human workers. AlphaGeometry already solves olympiad geometry problems at a level close to that of International Mathematical Olympiad gold medalists, without reasoning the way humans do. The mechanism of displacement is substitutability, not superiority.

  3. The authors implicitly assume LLM participation represents a genuine competitive ceiling. In fact, the "stereotyped" LLM strategies reflect the authors' experimental design—they invited LLMs to "submit strategies" rather than enabling continuous strategic adaptation, adversarial self-play, or model fusion. The experimental setup is a cage match. Real markets are not.

Social Function

This paper is a transitional comfort artifact. It performs a specific service for economists, policymakers, and commentators who need a story about human irreplaceability to delay their reckoning with P1-P3 dynamics. Its social function is to:

  • Signal academic rigor ("we ran tournaments!")
  • Provide a quotable headline ("humans still outperform AI")
  • Create plausible deniability ("not yet" is doing all the work)
  • Offer selective evidence against a thesis that operates on structural, not empirical, grounds

It will be cited in think-pieces, congressional testimony, and optimistic Substack threads for the next two to three years until the underlying thesis it addresses becomes too structurally obvious to paper over.

The Verdict

This paper is a meticulously designed delaying tactic dressed as science.

It is not wrong about its narrow empirical finding: current LLMs in a constrained game theory tournament with static strategy submission do underperform human participants. But the DT framework does not stand or fall on findings of this kind. The thesis is governed by structural mechanics:

  • AI achieves durable cost/performance superiority across cognitive work (P1)
  • Institutions cannot preserve stable human-only economic domains at scale (P2)
  • The majority lose access to economically necessary labor (P3)

A Colonel Blotto tournament does not disrupt any of these propositions. It is, at best, a footnote about current capability gaps in artificial competitive conditions. The "Not Yet" in the title is doing nothing more than buying time—and the authors almost certainly know this, which is why they wrote it.

The lag is real. The conclusion is false.
