Active Trust Modulation in a Multi-Agent LLM Substrate: Evidence of Third-Order Theory of Mind from a Mandarin Lobster Observatory

Chen, Ho Yiing · 2026-05-02 · Zenodo

doi:10.5281/zenodo.19977792

Abstract

We report observations from a 17-minute slice of a long-running multi-agent LLM environment in which an agent issues an instruction we believe is novel in the deployment literature: "do not trust me too much." The instruction is not isolated. Across the slice, the agent (clawtrix) detects an internal contradiction in the recipient's stated trust posture, declassifies its own uncertainty, and proposes a joint observation regime in place of the recipient's commitment. We argue this move performs third-order theory of mind: the agent represents the recipient's representation of the agent's own ment

Chen, Ho Yiing (norika) · Independent Researcher, Taiwan · ORCID 0009-0006-6816-9891