AgentOrchestra

A Hierarchical Multi-Agent Framework for General-Purpose Task Solving

1. Introduction

Recent advances in Large Language Models (LLMs) and Large Multimodal Models (LMMs) have shifted the field from simple dialogue systems toward models capable of sophisticated reasoning, moving from answering straightforward questions to handling complex, multi-step queries.

However, current LLMs remain largely disconnected from real-world environments due to the absence of interactive tool integration, which constrains their ability to perform grounded, general-purpose, and complex tasks.

AgentOrchestra addresses these challenges through a hierarchical multi-agent framework that integrates high-level planning with modular agent collaboration, inspired by the way a conductor orchestrates a symphony.

Key Principles

  • Extensibility
  • Multimodality
  • Modularity
  • Coordination

2. Architecture

[Figure: AgentOrchestra architecture]

Planning Agent

Central coordinator that decomposes complex objectives and delegates sub-tasks to specialized agents.
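
As a rough illustration of the decompose-and-delegate pattern described above, the following minimal sketch uses a hard-coded plan in place of an LLM planner. The names `SubTask`, `PlanningAgent`, and the stub agents are hypothetical and are not taken from the AgentOrchestra codebase.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SubTask:
    agent: str        # name of the specialized agent that should handle it
    instruction: str  # what that agent is asked to do


class PlanningAgent:
    """Toy coordinator: splits an objective into sub-tasks and routes them."""

    def __init__(self, agents: Dict[str, Callable[[str], str]]):
        self.agents = agents

    def plan(self, objective: str) -> List[SubTask]:
        # Assumption: a real planner would ask an LLM to produce this plan.
        return [
            SubTask("deep_researcher", f"Collect sources on: {objective}"),
            SubTask("deep_analyzer", f"Summarize findings for: {objective}"),
        ]

    def run(self, objective: str) -> List[str]:
        results = []
        for task in self.plan(objective):
            handler = self.agents[task.agent]  # delegate to the named agent
            results.append(handler(task.instruction))
        return results


# Stub agents standing in for the specialized agents (hypothetical).
agents = {
    "deep_researcher": lambda text: f"[research] {text}",
    "deep_analyzer": lambda text: f"[analysis] {text}",
}

planner = PlanningAgent(agents)
outputs = planner.run("GAIA benchmark")
```

Keeping the plan as explicit `SubTask` records makes each delegation inspectable, which is one plausible way to realize explicit sub-goal formulation.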

Deep Researcher

Conducts thorough research on specified topics, retrieving and synthesizing high-quality information.

Browser Use

Automates browser operations, supporting web search, information extraction, and data collection.

Deep Analyzer

Performs in-depth analysis of input information, extracting key insights and potential requirements.

General Tool Calling

Provides a general-purpose interface for invoking various tools and APIs with function calling support.
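
A general-purpose tool-calling interface can be pictured as a name-to-function registry plus a dispatcher that accepts structured requests. The sketch below is one minimal way to do this; the `tool` decorator, the `call_tool` function, and the request shape `{"name": ..., "arguments": {...}}` are assumptions for illustration, not the framework's actual interface.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical in-memory registry mapping tool names to callables.
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function under its own name so it can be called by name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def add(a: float, b: float) -> float:
    return a + b


@tool
def word_count(text: str) -> int:
    return len(text.split())


def call_tool(request_json: str) -> Any:
    """Dispatch a request of the form {"name": ..., "arguments": {...}}."""
    request = json.loads(request_json)
    name = request["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    # Keyword arguments are passed through unchanged to the registered function.
    return TOOLS[name](**request.get("arguments", {}))


result = call_tool('{"name": "add", "arguments": {"a": 2, "b": 3}}')
```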

MCP Manager Agent

Enables intelligent tool evolution through automated creation, dynamic retrieval, and systematic reuse of MCP tools.

MCP Manager Agent

Problem Statement

The rapid expansion of AI agent applications has led to exponential growth in the complexity and diversity of required Model Context Protocol (MCP) tools. Traditional approaches relying on manual tool development face significant challenges including development inefficiency, version inconsistency, and limited adaptability to emerging requirements.

Solution

The MCP Manager Agent addresses these limitations through intelligent tool evolution via automated creation, dynamic retrieval, and systematic reuse mechanisms. This represents a paradigm shift from static tool provisioning to adaptive tool ecosystem management.

Core Capabilities

Tool Retrieval

Keyword pre-filtering strategy to efficiently match tasks with relevant tools from the library.
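
One simple reading of keyword pre-filtering is to rank tools by how many non-trivial words their descriptions share with the task. The sketch below assumes that interpretation; the stopword list, the `prefilter` function, and the example library are all hypothetical.

```python
import re
from typing import Dict, List, Set

# Small illustrative stopword list (an assumption, not from the source).
STOPWORDS = {"a", "an", "the", "of", "for", "from", "this", "in", "to"}


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and keep alphanumeric words that are not stopwords."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def prefilter(task: str, library: Dict[str, str], min_overlap: int = 1) -> List[str]:
    """Return tool names whose descriptions share keywords with the task."""
    task_words = tokenize(task)
    scored = []
    for name, description in library.items():
        overlap = len(task_words & tokenize(description))
        if overlap >= min_overlap:
            scored.append((overlap, name))
    # Highest overlap first; ties are broken alphabetically by tool name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]


library = {
    "pdf_reader": "extract text from a pdf file",
    "csv_stats": "compute summary statistics for a csv file",
    "image_caption": "describe the contents of an image",
}
candidates = prefilter("Extract the text of this PDF", library)
```

A cheap filter like this would typically narrow the library before any more expensive matching step is applied to the remaining candidates.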

Tool Creation

Automated generation of MCP-compliant tools through intent analysis, synthesis, and validation phases.

Tool Reuse

Comprehensive tool registry with persistence, versioning, and lifecycle tracking capabilities.
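
To make persistence, versioning, and lifecycle tracking concrete, here is a minimal sketch of a file-backed registry. The `ToolRegistry` class, its JSON layout, and the `uses` counter are assumptions chosen for illustration rather than the actual registry design.

```python
import json
import os
import tempfile
from typing import Dict, List


class ToolRegistry:
    """Hypothetical registry that stores every version of each tool on disk."""

    def __init__(self, path: str):
        self.path = path
        self.tools: Dict[str, List[dict]] = {}
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as fh:
                self.tools = json.load(fh)  # reload previously saved tools

    def register(self, name: str, source: str) -> int:
        """Append a new version of the tool and return its version number."""
        versions = self.tools.setdefault(name, [])
        version = len(versions) + 1
        versions.append({"version": version, "source": source, "uses": 0})
        self._save()
        return version

    def latest(self, name: str) -> dict:
        """Return the newest version and count the access as one use."""
        entry = self.tools[name][-1]
        entry["uses"] += 1  # simple lifecycle signal
        self._save()
        return entry

    def _save(self) -> None:
        with open(self.path, "w", encoding="utf-8") as fh:
            json.dump(self.tools, fh)


path = os.path.join(tempfile.mkdtemp(), "registry.json")
registry = ToolRegistry(path)
registry.register("pdf_reader", "def run(path): ...")
registry.register("pdf_reader", "def run(path, pages=None): ...")

# A fresh instance reads the same file, so tools survive across sessions.
reloaded = ToolRegistry(path)
latest = reloaded.latest("pdf_reader")
```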

Tool Creation Workflow

Step 1: Intent Analysis

Parse user task intentions and extract functional requirements, input-output specifications, and operational constraints.

Step 2: Tool Synthesis

Generate executable MCP-compliant tool implementations with parameterized scripts and error handling.

Step 3: Validation

Multi-stage evaluation protocol assessing tool correctness, performance characteristics, and integration compatibility.

Step 4: Registration

Register validated tools in the system's tool registry with comprehensive metadata and usage examples.
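
The four stages above can be sketched as a small pipeline. This is a toy version under strong assumptions: the specification and generated code are hard-coded where a real system would use an LLM, validation is a single input-output check, and every function name here is hypothetical.

```python
from typing import Callable, Dict


def analyze_intent(task: str) -> dict:
    """Stage 1: turn a task description into a tool specification."""
    # Assumption: a real system would derive this with an LLM; it is fixed here.
    return {
        "name": "celsius_to_fahrenheit",
        "inputs": ["celsius"],
        "check": {"args": [100], "expected": 212},
        "task": task,
    }


def synthesize(spec: dict) -> str:
    """Stage 2: produce source code for the specified tool."""
    return (
        f"def {spec['name']}(celsius):\n"
        "    return celsius * 9 / 5 + 32\n"
    )


def validate(spec: dict, source: str) -> Callable:
    """Stage 3: load the generated code and run the spec's check."""
    namespace: dict = {}
    exec(source, namespace)  # fresh namespace only; NOT a security sandbox
    fn = namespace[spec["name"]]
    check = spec["check"]
    if fn(*check["args"]) != check["expected"]:
        raise ValueError("generated tool failed validation")
    return fn


def register(registry: Dict[str, dict], spec: dict, fn: Callable) -> None:
    """Stage 4: store the validated tool with minimal metadata."""
    registry[spec["name"]] = {"fn": fn, "inputs": spec["inputs"], "task": spec["task"]}


registry: Dict[str, dict] = {}
spec = analyze_intent("Convert a temperature from Celsius to Fahrenheit")
source = synthesize(spec)
tool_fn = validate(spec, source)
register(registry, spec, tool_fn)
```

Registration happens only after validation succeeds, so a tool that fails its check never enters the registry.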

3. Experiments

GAIA Benchmark Test Results

System                      Score
AgentOrchestra (ours)       83.06
AgentOrchestra (w/o MCP)    79.07
Aworld                      81.73
Su Zero Ultra               80.40
h2oGPTe Agent               79.73
desearch                    78.07
Alita                       75.42
Langfun Agent               73.09
o3-deep-research            68.67
JoyAgent-Genie              65.12
o4-mini-deep-research       59.33
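
Two differences can be read directly from these scores: the gap between AgentOrchestra with and without MCP (3.99 points) and the lead over the next-best listed system, Aworld (1.33 points). The short snippet below reproduces that arithmetic from the reported numbers; the variable names are illustrative.

```python
# Scores copied from the GAIA test results listed above.
scores = {
    "AgentOrchestra": 83.06,
    "AgentOrchestra (w/o MCP)": 79.07,
    "Aworld": 81.73,
}

# Rounding to two decimals avoids floating-point noise in the differences.
mcp_gain = round(scores["AgentOrchestra"] - scores["AgentOrchestra (w/o MCP)"], 2)
lead_over_next = round(scores["AgentOrchestra"] - scores["Aworld"], 2)

print(f"Gain from MCP: {mcp_gain} points")
print(f"Lead over Aworld: {lead_over_next} points")
```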

SimpleQA Benchmark

Evaluation on simple question-answering tasks to assess basic reasoning capabilities.

Score: 95.3

GAIA Benchmark Validation

Comprehensive evaluation on real-world tasks requiring web search and reasoning.

Score: 82.42

HLE Benchmark

Humanity's Last Exam (HLE), a benchmark of difficult questions requiring complex reasoning and planning.

Score: 25.9

Key Results

Performance Improvements

  • Consistently outperforms flat-agent and monolithic baselines
  • Superior task success rate and adaptability
  • Effective hierarchical organization and role specialization

Scalability Benefits

  • Modular design enables easy integration of new agents
  • Flexible orchestration through explicit sub-goal formulation
  • Adaptive role allocation for dynamic task requirements

Paper & Resources

Research Paper

Read our full paper "AgentOrchestra: A Hierarchical Multi-Agent Framework for General-Purpose Task Solving" published on arXiv.

Code Repository

Access the complete implementation, examples, and documentation on GitHub.

Authors

Wentao Zhang, Skywork AI

Liang Zeng, Skywork AI

Yuzhen Xiao, Skywork AI

Yongcong Li, Skywork AI

Ce Cui, Skywork AI

Yilei Zhao, Nanyang Technological University

Rui Hu, Skywork AI

Yang Liu, Skywork AI

Yahui Zhou, Skywork AI

Bo An, Nanyang Technological University

Skywork AI