What Should Agents Say? Action-state Communication for Efficient Multi-Agent Systems
FIRST LINE: arXiv > cs.AI > Submitted on 3 Jun 2026
The Dissection
This is an engineering optimization paper. It is fundamentally about making multi-agent AI systems cheaper to run—specifically by reducing token consumption in inter-agent communication. The core finding: free-form natural language between agents wastes tokens and context without proportionate performance gain; compressing agent outputs into structured "action-state records" before sharing is more efficient.
PACT, their proposed framework, projects raw agent outputs into compact records. Reported results: on OpenHands, the resolve rate improves while tokens per resolved task fall by 10%; on SWE-agent, the resolve rate is unchanged but input tokens are halved.
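To make "projecting raw output into a compact record" concrete, here is a minimal sketch. It is not PACT's implementation: the `ActionStateRecord` fields, the `KEY: value` line convention, the `project` function, and the sample message are all hypothetical, chosen only to illustrate the idea of sharing a structured record instead of free-form text. Character count stands in, crudely, for a real tokenizer.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ActionStateRecord:
    """Hypothetical compact record; the field names are assumptions, not PACT's schema."""
    agent: str
    action: str
    target: str
    status: str


def project(agent: str, raw_output: str) -> ActionStateRecord:
    """Toy projection: keep 'ACTION/TARGET/STATUS: value' lines, drop all other prose."""
    wanted = {"ACTION", "TARGET", "STATUS"}
    fields = {}
    for line in raw_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().upper() in wanted:
            fields[key.strip().lower()] = value.strip()
    return ActionStateRecord(
        agent=agent,
        action=fields.get("action", "unknown"),
        target=fields.get("target", ""),
        status=fields.get("status", "unknown"),
    )


def rough_size(text: str) -> int:
    """Character count, a crude stand-in for a real token count."""
    return len(text)


# Made-up free-form agent output, for illustration only.
raw = """I looked through the repository and I think the bug is in the parser.
Let me explain my reasoning at length before I say what I did.
ACTION: edit
TARGET: src/parser.py
STATUS: tests passing
Happy to elaborate on any of this if the reviewer agent wants more detail."""

record = project("coder", raw)
message = json.dumps(asdict(record), separators=(",", ":"))

print(message)
print(rough_size(raw), "chars raw vs", rough_size(message), "chars shared")
```

The downstream agent receives only `message`. Whether a projection like this loses information the next agent needs is exactly the trade-off the paper's resolve-rate numbers are meant to measure.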
The Core Fallacy
The paper operates entirely inside the assumption that multi-agent LLM systems are a viable paradigm worth optimizing. It is pure engineering within a paradigm that the Discontinuity Thesis renders structurally transient.
The framing treats token inflation as a cost optimization problem to be solved with better communication protocols. It does not engage with the possibility that the entire architecture (roles, pipelines, turn schedules, inter-agent chatter) is itself a transitional artifact. The paper is optimizing the fuel injection on an engine that runs on mass employment, which is being mechanically removed.
Hidden Assumptions
- Multi-agent coordination remains necessary. The paper assumes that distributing cognitive work across multiple AI agents is the correct architecture for the long run. DT Axiom P1 implies that cognitive coordination costs collapse when a single AI system handles the full task. Multi-agent systems are lag-phase artifacts.
- Token cost is the binding constraint. In the framework, token efficiency is the primary optimization target. But the true binding constraint under DT is productive participation, not inference cost. If the system produces no economically necessary output, driving its token cost to zero doesn't matter.
- Task performance is the relevant metric. "Resolve rate" for coding tasks is the success criterion. The paper does not ask whether the resolved tasks are economically necessary human labor or automated AI labor by another route.
Social Function
Transition Management Infrastructure. The paper makes multi-agent LLM systems cheaper to run, which accelerates their deployment into real-world workflows. This is precisely the kind of research that tightens the productive participation constraint faster than it would tighten organically.
It is also, inadvertently, a symptom: when the primary engineering challenge in multi-agent systems is "we're burning too many tokens talking to each other," that is evidence of architectural brittleness. Biological systems don't have this problem — coordination without explicit token accounting is a feature of evolved systems. The fact that LLM multi-agent systems require explicit protocol optimization suggests they are not converging toward robust intelligence but rather scaffolding around a fundamental limitation.
The Verdict
This is technically competent work that optimizes the wrong thing in the wrong direction. It makes AI agents communicate more efficiently with each other — accelerating the displacement of the human coordination layer that is currently the last bastion of mass employment.
The productive insight for DT analysis: the paper documents a specific mechanism of AI system self-improvement (internal communication compression) that moves the needle on P1 (Cognitive Automation Dominance). PACT-style compression is a step toward AI coordination without human intermediaries — exactly the trajectory that severs the wage-consumption circuit.
Oracle Assessment: Useful engineering within an obsolete paradigm. Accelerant, not solution.