MASPOB: Bandit-Based Prompt Optimization for Multi-Agent Systems with Graph Neural Networks

MASPOB (Multi-Agent System Prompt Optimization via Bandits) is a novel framework that efficiently optimizes text prompts for multi-agent AI systems. It combines bandit algorithms for sample efficiency, Graph Neural Networks (GNNs) to manage system topology, and a decomposition strategy to address combinatorial search challenges. The framework achieves state-of-the-art results by balancing exploration and exploitation while modeling how prompts propagate through agent networks.

New Framework MASPOB Tackles the Critical Challenge of Optimizing Prompts for Multi-Agent AI Systems

Researchers have introduced MASPOB (Multi-Agent System Prompt Optimization via Bandits), a framework for efficiently optimizing the text prompts that orchestrate Multi-Agent Systems (MAS). As large language models (LLMs) become the cognitive backbone of these systems, performance depends critically on prompt quality, yet real-world optimization is hindered by high evaluation costs and system complexity. MASPOB addresses this by combining bandit algorithms for sample efficiency, Graph Neural Networks (GNNs) to model system topology, and a decomposition strategy to tame the combinatorial search space; the authors report state-of-the-art results on benchmarks.

The Core Challenges in Multi-Agent Prompt Optimization

Optimizing prompts for a network of interacting AI agents presents unique hurdles not found in single-model scenarios. The performance of an MAS workflow is highly sensitive to the input prompts guiding each agent, but modifying the underlying workflow code is often impossible in locked deployment environments. This makes prompt optimization the primary lever for improvement. However, three major challenges impede progress: the prohibitive evaluation cost of testing prompts in a full system, the topology-induced coupling where one agent's prompt affects others' performance, and the combinatorial explosion of possible prompt combinations across all agents.

How MASPOB's Architecture Solves These Problems

The MASPOB framework is engineered to tackle each challenge systematically. To achieve sample efficiency under a strict evaluation budget, it employs a bandit optimization approach, specifically utilizing the Upper Confidence Bound (UCB) algorithm. This allows the system to balance exploring new prompts and exploiting known high-performers, maximizing gains with minimal, costly evaluations.
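The exploration/exploitation trade-off can be illustrated with textbook UCB1 over a small set of candidate prompts. This is a minimal sketch, not MASPOB's implementation: the article does not specify which UCB variant the paper uses, and the function names, the exploration constant `c`, and the simulated Bernoulli success rates are all assumptions made for illustration.

```python
import math
import random


def ucb1_select(counts, means, t, c=2.0):
    """Return the index of the candidate with the highest UCB1 score.

    counts[i] is how often candidate i was evaluated, means[i] its average
    reward so far, and t the current round (1-based).
    """
    # Evaluate every candidate once before trusting the confidence bound.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    # Score = empirical mean + bonus that shrinks as a candidate is tried more.
    scores = [m + math.sqrt(c * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def run_ucb1(true_rates, budget, seed=0):
    """Spend `budget` evaluations on simulated prompts (hypothetical rewards)."""
    rng = random.Random(seed)
    counts = [0] * len(true_rates)
    means = [0.0] * len(true_rates)
    for t in range(1, budget + 1):
        arm = ucb1_select(counts, means, t)
        # Stand-in for one costly end-to-end run of the multi-agent workflow.
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # running average
    return counts, means


if __name__ == "__main__":
    counts, means = run_ucb1([0.2, 0.5, 0.9], budget=200)
    print("evaluations per candidate:", counts)
    print("estimated success rates:", [round(m, 2) for m in means])
```

In a real deployment, the simulated reward would be replaced by an actual evaluation of the workflow on a validation task, which is exactly the expensive step the bandit tries to ration.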

To address the interconnected nature of agents, MASPOB integrates Graph Neural Networks (GNNs). The GNNs learn topology-aware representations of prompt semantics, modeling how prompts and their effects propagate through the specific structure of the agent network. The framework also uses coordinate ascent to decompose the high-dimensional optimization problem into a series of univariate sub-problems: one agent's prompt is optimized at a time while the others are held fixed. This reduces the search cost per pass from exponential to linear in the number of agents.
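The decomposition idea can be sketched with a toy coordinate ascent over per-agent prompt choices. Everything here is hypothetical: the three-agent chain, the `QUALITY` table, and `toy_score` are stand-ins for a real evaluation (in MASPOB, the article suggests a GNN-based model guided by UCB would play this role), and the small coupling bonus only mimics the fact that one agent's prompt can affect another's contribution.

```python
from math import prod

# Hypothetical per-agent quality of each candidate prompt (3 agents x 3 prompts).
QUALITY = [
    [0.1, 0.4, 0.3],
    [0.2, 0.1, 0.5],
    [0.3, 0.6, 0.2],
]


def toy_score(choice):
    """Toy system score: per-agent quality plus a bonus for matching neighbors."""
    base = sum(QUALITY[agent][idx] for agent, idx in enumerate(choice))
    coupling = 0.05 * sum(1 for a, b in zip(choice, choice[1:]) if a == b)
    return base + coupling


def coordinate_ascent(num_options, score, sweeps=2):
    """Optimize one agent's prompt index at a time, holding the others fixed."""
    current = [0] * len(num_options)  # start from each agent's first candidate
    evaluations = 0
    for _ in range(sweeps):
        for agent, n_opts in enumerate(num_options):
            best_idx, best_val = current[agent], None
            for idx in range(n_opts):
                trial = current[:agent] + [idx] + current[agent + 1:]
                val = score(trial)
                evaluations += 1
                if best_val is None or val > best_val:
                    best_idx, best_val = idx, val
            current[agent] = best_idx
    return current, evaluations


if __name__ == "__main__":
    num_options = [len(row) for row in QUALITY]
    choice, used = coordinate_ascent(num_options, toy_score)
    print("chosen prompt index per agent:", choice)
    print("evaluations used:", used)
    print("joint configurations in exhaustive search:", prod(num_options))
```

Each sweep costs the sum of the per-agent candidate counts (9 here), whereas exhaustive search costs their product (27 here), and the gap widens quickly as agents are added. Coordinate ascent is a local method, so in general it can stop at a configuration that no single-agent change improves without that configuration being the global best.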

Proven Performance and Future Implications

Extensive experiments validate MASPOB's effectiveness. The research, detailed in the paper arXiv:2603.02630v1, demonstrates that MASPOB consistently outperforms existing baseline methods across diverse benchmarks. This superior performance underscores its potential as a practical tool for developers and enterprises deploying complex LLM-powered multi-agent systems in real-world scenarios where reliability and cost are paramount.

From an expert perspective, this work moves beyond treating prompt engineering as an art for single models and establishes it as a rigorous, scalable engineering discipline for system-level AI. The integration of GNNs to capture structural priors is a particularly insightful innovation, acknowledging that in multi-agent environments, context and connection are as important as content.

Why This Matters: Key Takeaways

  • Enables Practical Deployment: MASPOB provides a sample-efficient pathway to optimize locked multi-agent workflows, making advanced AI systems more reliable and performant in production.
  • Solves Systemic Complexity: It directly addresses the unique challenges of MAS (evaluation cost, agent coupling, and combinatorial search) through a unified bandit-based framework.
  • Introduces Structural Awareness: The use of Graph Neural Networks to model agent topology is a novel contribution that could influence future research on networked AI systems.
  • Boosts State-of-the-Art: Empirical results show MASPOB achieves superior performance, setting a new benchmark for prompt optimization in complex, multi-agent environments.