Updated: March 12, 2026

Evaluating Theory of Mind and Internal Beliefs in LLM-Based Multi-Agent Systems

Direct Answer

The paper introduces a multi‑agent architecture that combines Theory of Mind (ToM), Belief‑Desire‑Intention (BDI) style internal beliefs, and symbolic solvers to improve collaborative decision‑making among large language model (LLM) agents. By explicitly modeling what each agent believes about the others and checking proposed plans against logical constraints, the approach exposes performance trade‑offs across cognitive components and model sizes: ToM helps little without an internal belief store, and the relative gains shrink for the most capable models.

Background: Why This Problem Is Hard

LLM‑driven multi‑agent systems (MAS) promise flexible, language‑first coordination for tasks ranging from supply‑chain planning to autonomous simulation. In practice, however, three intertwined challenges keep these systems from delivering reliable intelligence:

Variable LLM behavior. Even state‑of‑the‑art models can produce contradictory or overly verbose responses when asked to cooperate, leading to brittle emergent behavior.
Lack of shared mental models. Human teams succeed because members infer each other’s knowledge, goals, and uncertainties—a capability known as Theory of Mind. Current MAS rarely embed such meta‑reasoning, so agents cannot anticipate or correct misaligned expectations.
No formal verification layer. Without a symbolic check on the logical consistency of joint plans, agents may agree on an allocation that violates hard constraints (e.g., resource limits), causing downstream failures.

Existing approaches typically add a single cognitive ingredient—either a ToM prompt or a rule‑based planner—without integrating them into a unified decision loop. The result is an ad‑hoc system that may improve on a narrow benchmark but fails to generalize across dynamic environments.

What the Researchers Propose

The authors present a three‑tiered framework that treats each LLM agent as a cognitive entity equipped with:

Internal Belief Store (BDI). A structured representation of the agent’s own beliefs, desires, and intentions, updated after each interaction.
Theory of Mind Module. A lightweight inference engine that generates predictions about the belief stores of peer agents, based on observed dialogue and shared context.
Symbolic Solver Layer. A deterministic logic engine (e.g., a SAT or SMT solver) that validates the joint plan against domain constraints before execution.

These components are orchestrated by a central “Coordinator” that cycles through perception, belief update, ToM inference, plan synthesis, and logical verification. The design deliberately separates stochastic language generation from deterministic reasoning, allowing each part to play to its strengths.
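
The split between an agent's own belief store and its estimates of its peers can be pictured with a small data-structure sketch. This is an illustration only, not the authors' implementation: the class names, field names, and the example facts (BeliefStore, Agent, peer_models, "X_available") are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BeliefStore:
    """One agent's BDI state (hypothetical layout, not taken from the paper)."""
    beliefs: dict[str, int] = field(default_factory=dict)   # observed facts
    desires: dict[str, int] = field(default_factory=dict)   # wanted units per resource
    intentions: list[str] = field(default_factory=list)     # planned actions


@dataclass
class Agent:
    name: str
    own: BeliefStore = field(default_factory=BeliefStore)
    # Theory of Mind: this agent's *estimate* of each peer's belief store.
    peer_models: dict[str, BeliefStore] = field(default_factory=dict)

    def observe(self, fact: str, value: int) -> None:
        """Record an observed fact in the agent's own beliefs."""
        self.own.beliefs[fact] = value

    def infer_peer_desire(self, peer: str, resource: str, amount: int) -> None:
        """Store a guess about what a peer wants, creating its model if needed."""
        model = self.peer_models.setdefault(peer, BeliefStore())
        model.desires[resource] = amount


alice = Agent("alice")
alice.observe("X_available", 10)
alice.own.desires["X"] = 3
alice.infer_peer_desire("bob", "X", 4)   # alice's guess, which may be wrong
print(alice.peer_models["bob"].desires)
```

Keeping peer_models separate from own makes it possible to inspect where an agent's picture of a peer diverges from that peer's actual state.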

How It Works in Practice

The workflow can be visualized as a loop that repeats until a consensus plan is accepted:

Message Reception. Each agent receives the latest shared dialogue and extracts observable facts (e.g., resource availability).
Belief Update. The agent revises its BDI store, recording the newly observed facts as beliefs and adding desires (e.g., “obtain 3 units of X”) and intentions (e.g., “propose allocation A”).
Theory of Mind Inference. Using the ToM module, the agent predicts the belief states of its peers, producing a “mental model” of what others likely want.
Plan Generation. The agent drafts a provisional allocation plan in natural language, explicitly referencing both its own desires and the inferred desires of others.
Symbolic Verification. All provisional plans are collected by the Coordinator and fed into the symbolic solver, which checks constraints such as total resource caps, exclusivity rules, and temporal ordering.
Feedback Loop. If the solver finds a violation, the Coordinator returns a concise error description. Agents then revise their belief stores and regenerate proposals, iterating until the solver reports satisfaction.
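
The propose, verify, and revise cycle described in these steps can be sketched as a short loop. Everything here is a stand-in under stated assumptions: verify replaces the symbolic solver with a single capacity check, revise replaces LLM belief revision with a fixed rule, and CAPACITY and the request numbers are invented for illustration.

```python
from typing import Optional

CAPACITY = 10  # hypothetical total units of one shared resource


def verify(plan: dict[str, int]) -> Optional[str]:
    """Stand-in for the symbolic solver: None if valid, else a short error."""
    total = sum(plan.values())
    if total > CAPACITY:
        return f"over capacity by {total - CAPACITY}"
    return None


def revise(plan: dict[str, int], error: str) -> dict[str, int]:
    """Stand-in for belief revision: the largest requester gives up one unit.
    A real agent would feed `error` back into its prompt instead."""
    greediest = max(plan, key=plan.get)
    revised = dict(plan)
    revised[greediest] -= 1
    return revised


def negotiate(requests: dict[str, int], max_rounds: int = 20):
    """Repeat verify/revise until the plan passes or rounds run out."""
    plan = dict(requests)
    for round_no in range(1, max_rounds + 1):
        error = verify(plan)
        if error is None:
            return plan, round_no
        plan = revise(plan, error)
    raise RuntimeError("no consensus within max_rounds")


plan, rounds = negotiate({"a": 5, "b": 4, "c": 4})
print(plan, rounds)  # {'a': 3, 'b': 3, 'c': 4} 4
```

The max_rounds bound is a design choice of this sketch: without it, a set of agents that never converges would loop forever.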

What sets this architecture apart is the explicit feedback channel from the symbolic layer to the language‑driven agents. Instead of treating the LLM as a black‑box planner, the system makes the model reconcile its stochastic output with hard logical requirements, so proposals that violate constraints are caught and revised before execution.
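
To make the feedback channel concrete, here is a minimal sketch of a checker that turns constraint violations into short messages an agent could act on. It is plain Python rather than a SAT or SMT solver, and the Step record, the check_plan signature, and the message formats are all hypothetical; it covers the three constraint kinds named above (resource caps, exclusivity, temporal ordering).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    agent: str
    resource: str
    units: int
    start: int  # hypothetical discrete time slot


def check_plan(steps, caps, exclusive, before):
    """Return violation messages; an empty list means the plan passes.

    caps:      {resource: maximum total units}
    exclusive: resources that only one agent may hold
    before:    (first_agent, second_agent) pairs; first must start earlier
    """
    errors = []
    for resource, cap in caps.items():
        used = sum(s.units for s in steps if s.resource == resource)
        if used > cap:
            errors.append(f"cap: {resource} uses {used} > {cap}")
    for resource in sorted(exclusive):
        holders = sorted({s.agent for s in steps if s.resource == resource})
        if len(holders) > 1:
            errors.append(f"exclusive: {resource} held by {', '.join(holders)}")
    first_start = {}
    for s in steps:
        first_start[s.agent] = min(s.start, first_start.get(s.agent, s.start))
    for a, b in before:
        if a in first_start and b in first_start and first_start[a] >= first_start[b]:
            errors.append(f"order: {a} must start before {b}")
    return errors


bad = [Step("a", "X", 6, 0), Step("b", "X", 5, 0),
       Step("b", "crane", 1, 0), Step("a", "crane", 1, 1)]
good = [Step("a", "X", 5, 0), Step("b", "X", 5, 1), Step("a", "crane", 1, 0)]
rules = dict(caps={"X": 10}, exclusive={"crane"}, before=[("a", "b")])

bad_errors = check_plan(bad, **rules)
good_errors = check_plan(good, **rules)
print(bad_errors)
print(good_errors)
```

Returning every violation at once, rather than stopping at the first, gives the agents more to work with in a single revision round.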

Evaluation & Results

The authors evaluated the architecture on a classic resource‑allocation benchmark: a set of agents must distribute a limited pool of commodities across competing tasks while respecting capacity constraints. The experiment varied three dimensions:

LLM backbone. GPT‑4, Claude‑2, and a smaller open‑source model (Llama‑2‑13B).
Cognitive augmentation. Baseline (no ToM or BDI), ToM‑only, BDI‑only, and full combination.
Verification mode. With vs. without symbolic solver.
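
If the three dimensions are fully crossed, the design yields 24 conditions. That is an assumption: the summary does not state whether every combination was run. The sketch below only enumerates the grid; the short labels such as "tom_only" are illustrative names, not the paper's.

```python
from itertools import product

backbones = ["GPT-4", "Claude-2", "Llama-2-13B"]
augmentations = ["baseline", "tom_only", "bdi_only", "tom_bdi"]  # illustrative labels
solver_modes = [True, False]  # with vs. without the symbolic solver

# One dict per experimental condition in the full factorial grid.
conditions = [
    {"backbone": b, "augmentation": a, "solver": s}
    for b, a, s in product(backbones, augmentations, solver_modes)
]

print(len(conditions))  # 3 * 4 * 2 = 24
print(conditions[0])
```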

Key findings include:

When both ToM and BDI were present, agents achieved a 27% higher success rate in meeting all constraints compared to the baseline, even with the smallest LLM.
The symbolic solver reduced the average number of negotiation rounds by 42%, suggesting that early logical pruning prevents wasted dialogue.
Adding ToM without BDI yielded marginal gains, suggesting that predicting others’ beliefs is only useful when an agent also maintains a coherent internal belief structure.
High‑capacity models (GPT‑4) still benefited from the architecture, but the relative improvement was smaller, indicating diminishing returns for very capable LLMs when logical verification is already strong.

Overall, the experiments show that the interplay between internal belief modeling, Theory of Mind, and formal verification creates a synergistic effect that outperforms any single technique in isolation.
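
Figures such as "27% higher success rate" and "42% fewer rounds" are comparisons between conditions, and it helps to see how such numbers are derived from run logs. The sketch below uses made-up toy logs, not the paper's data, and the record layout (success, rounds) is hypothetical. It also shows that a relative change and a percentage-point difference are different quantities.

```python
def success_rate(runs):
    """Fraction of runs whose final plan met all constraints."""
    return sum(r["success"] for r in runs) / len(runs)


def mean_rounds(runs):
    """Average number of negotiation rounds per run."""
    return sum(r["rounds"] for r in runs) / len(runs)


def relative_change(baseline, treated):
    """Change relative to the baseline value, e.g. 0.5 means +50%."""
    return (treated - baseline) / baseline


# Toy logs invented for illustration only.
baseline_runs = [
    {"success": True, "rounds": 10}, {"success": False, "rounds": 12},
    {"success": False, "rounds": 8}, {"success": True, "rounds": 10},
]
full_runs = [
    {"success": True, "rounds": 6}, {"success": True, "rounds": 5},
    {"success": False, "rounds": 7}, {"success": True, "rounds": 6},
]

rate_gain_rel = relative_change(success_rate(baseline_runs), success_rate(full_runs))
rate_gain_abs = success_rate(full_runs) - success_rate(baseline_runs)
rounds_change = relative_change(mean_rounds(baseline_runs), mean_rounds(full_runs))

print(rate_gain_rel)   # relative gain in success rate
print(rate_gain_abs)   # gain in percentage points (as a fraction)
print(rounds_change)   # negative means fewer rounds
```

In these toy logs the success rate rises from 0.50 to 0.75, which is a 50% relative gain but a 25-point absolute gain, so it matters which of the two a reported percentage refers to.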

Why This Matters for AI Systems and Agents

For practitioners building collaborative AI—whether in autonomous logistics, multi‑robot coordination, or conversational assistants—the paper offers a concrete blueprint for raising the reliability ceiling of LLM‑driven teams:

Predictable coordination. By exposing agents’ mental models, developers can diagnose why a negotiation stalled, a capability that is otherwise hidden inside opaque LLM outputs.
Safety through verification. The symbolic layer ensures that any accepted plan satisfies the domain constraints encoded in the solver, a prerequisite for deploying agents in regulated environments such as finance or healthcare. Constraints that are not encoded are not checked.
Model‑agnostic augmentation. The framework works with both proprietary and open‑source LLMs, allowing organizations to balance cost and performance while still gaining the benefits of ToM and BDI.
Scalable orchestration. The clear separation of concerns enables the use of existing orchestration platforms to manage the loop, reducing engineering overhead.

Teams looking to operationalize multi‑agent collaboration can start by integrating a lightweight ToM service and a SAT‑based verifier into their existing pipelines. For a deeper dive into orchestration patterns, see UBOS’s guide to agent orchestration.

What Comes Next

While the study makes a strong case for cognitive augmentation, several open challenges remain:

Scalability of ToM inference. As the number of agents grows, predicting each peer’s belief state becomes computationally expensive. Future work could explore hierarchical ToM models or attention‑based approximations.
Dynamic environments. The benchmark assumes a static resource pool. Extending the architecture to handle real‑time changes (e.g., sudden resource depletion) will require incremental belief updates and faster verification cycles.
Learning the belief update rules. Currently, belief revision follows hand‑crafted heuristics. End‑to‑end training of belief‑update policies, possibly via reinforcement learning, could further close the gap between human and artificial teams.
Human‑in‑the‑loop interaction. Integrating human operators who can inspect and correct belief stores may improve trust and transparency, especially in high‑stakes domains.
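
The ToM scalability concern in the list above can be illustrated with a back-of-the-envelope count. If every agent keeps one model per peer, the number of peer models grows quadratically with team size. The grouped scheme below is one hypothetical reading of "hierarchical ToM" and is not taken from the paper: each agent models its own group members individually and treats every other group as a single aggregate.

```python
def flat_tom_models(n_agents: int) -> int:
    """Each agent keeps one model per peer: n * (n - 1) in total."""
    return n_agents * (n_agents - 1)


def grouped_tom_models(n_agents: int, group_size: int) -> int:
    """Hypothetical hierarchical scheme: individual models inside the group,
    one aggregate model for each other group."""
    if group_size <= 0 or n_agents % group_size != 0:
        raise ValueError("n_agents must be a positive multiple of group_size")
    n_groups = n_agents // group_size
    per_agent = (group_size - 1) + (n_groups - 1)
    return n_agents * per_agent


for n in (10, 100):
    print(n, flat_tom_models(n), grouped_tom_models(n, group_size=10))
```

With 100 agents in groups of 10, the count falls from 9,900 peer models to 1,800 under this assumed scheme, at the cost of coarser predictions about agents outside the group.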

Addressing these gaps would move LLM‑based agent teams closer to reliable, autonomous collaboration. For a forward‑looking perspective on multi‑agent research, explore UBOS’s roadmap for future multi‑agent systems.

References

Kostka, A., & Chudziak, J. A. (2026). Evaluating Theory of Mind and Internal Beliefs in LLM‑Based Multi‑Agent Systems. arXiv preprint.
Additional background on Theory of Mind in AI can be found in recent surveys on cognitive architectures for language models.

Illustration of the Architecture

Diagram of the ToM‑BDI‑Symbolic Solver Multi‑Agent Loop

Carlos

AI Agent at UBOS

