Architecture¶

System Overview¶

The Idun Agent Platform architecture consists of three layers working together to provide a complete agent deployment and management solution:

┌─────────────────────────────────────────────┐
│           Client Applications               │
│     (Web, Mobile, CLI, API consumers)       │
└────────────────┬────────────────────────────┘
                 │ HTTP/REST
                 ▼
┌─────────────────────────────────────────────┐
│         Idun Agent Manager (Optional)       │
│  ┌─────────────────────────────────────┐   │
│  │  REST API │ Web UI │ CLI │ Auth     │   │
│  └─────────────────────────────────────┘   │
│  ┌─────────────────────────────────────┐   │
│  │  PostgreSQL Database                │   │
│  │  (Configs, API Keys, Metadata)      │   │
│  └─────────────────────────────────────┘   │
└────────────────┬────────────────────────────┘
                 │ Config Retrieval via API Key
                 ▼
┌─────────────────────────────────────────────┐
│           Idun Agent Engine                 │
│  ┌─────────────────────────────────────┐   │
│  │  FastAPI Server │ Config Loader     │   │
│  └─────────────────────────────────────┘   │
│  ┌─────────────────────────────────────┐   │
│  │  Observability │ Guardrails │ MCP   │   │
│  └─────────────────────────────────────┘   │
│  ┌─────────────────────────────────────┐   │
│  │  Framework Adapters                 │   │
│  │  LangGraph │ Haystack │ ADK        │   │
│  └─────────────────────────────────────┘   │
└────────────────┬────────────────────────────┘
                 │ Agent Invocation
                 ▼
┌─────────────────────────────────────────────┐
│      External Services & Storage            │
│  LLMs │ Databases │ Vector Stores │ APIs   │
└─────────────────────────────────────────────┘

Component Details¶

Engine (Runtime Layer)¶

The Engine is the runtime execution environment for AI agents. It's a FastAPI application that loads agent code, manages framework-specific adapters, and provides a unified REST API.

Application Factory (create_app)

Creates and configures the FastAPI application instance: - Accepts configuration from three sources: EngineConfig object (highest priority), configuration dictionary, or YAML file path - Configures middleware (CORS, error handlers) - Registers routes (/agent/invoke, /agent/stream, /healthz, /readyz) - Sets up dependency injection for config and agent instances - Handles graceful shutdown and cleanup

Configuration System

Three-tier resolution priority: 1. EngineConfig object passed directly to create_app() (highest priority) 2. Configuration dictionary (Python dict) 3. File path to YAML configuration (lowest priority)

Environment variable substitution occurs at runtime, replacing ${VAR_NAME} with actual values. Pydantic validation ensures all required fields are present and types match before agent initialization.

Server Runner (run_server)

Uvicorn-based server launcher: - Hot-reload support for development - Multi-worker configuration for production - Graceful shutdown with signal handling - Configurable host, port, and logging levels

Agent Adapters

Framework-specific implementations of the BaseAgent protocol: - LangGraph Adapter: Loads compiled graphs, manages checkpointing (SQLite/PostgreSQL/In-Memory), streams execution events - Haystack Adapter: Loads pipelines or agents, integrates with document stores and retrievers - ADK Adapter: Initializes session services (InMemory/Database/VertexAI) and memory services (InMemory/VertexAI)

Request Processing Pipeline

HTTP request received at /agent/invoke or /agent/stream
Route to appropriate endpoint handler
Load agent configuration (if not cached)
Execute input guardrails sequentially
Invoke agent via framework adapter
Execute output guardrails sequentially
Return response with observability trace IDs

Manager (Control Plane)¶

The Manager provides centralized configuration storage and multi-tenant agent hosting. It's an optional component—agents can run directly with the Engine for simpler deployments.

REST API Layer

FastAPI application with endpoints for: - Agents: CRUD operations, API key generation, configuration retrieval - Observability: CRUD for observability provider configurations - Guardrails: CRUD for guardrail validator configurations - MCP Servers: CRUD for MCP server configurations - Agent Frameworks: List available frameworks and metadata - Health Checks: /healthz and /readyz for monitoring

Database Persistence

SQLAlchemy async ORM with PostgreSQL: - ManagedAgentModel: Stores agent configurations and metadata - ManagedObservabilityModel: Stores observability provider configs - ManagedGuardrailModel: Stores guardrail validator configs - ManagedMCPServerModel: Stores MCP server configs - Includes created_at, updated_at timestamps for audit trails - Connection pooling for performance - Alembic migrations for schema changes

Authentication System

API key-based authentication: - Generates unique keys per agent: idun-{random_hash} - Cryptographically secure random generation - Stores agent_hash in database for validation - Bearer token authentication on Engine requests - Agent-specific access control (one key per agent)

Configuration Management

Stores EngineConfig as JSON in database
Version tracking via updated_at timestamps
Validates configuration before storage using Pydantic schemas
Supports environment variable placeholders in stored configs

Schema (Data Models)¶

Shared Pydantic models used by both Engine and Manager for validation and serialization.

Configuration Models

EngineConfig: Top-level configuration (server, agent, observability, guardrails, MCP)
ServerConfig: API server settings (host, port, CORS)
AgentConfig: Framework-specific agent configuration
LangGraphAgentConfig: Graph definition, checkpointer settings
HaystackAgentConfig: Component type, component definition
AdkAgentConfig: Session service, memory service, app name
ObservabilityConfig: Provider type and provider-specific settings
GuardrailsConfig: Input/output guardrail configurations
MCPServerConfig: MCP server connection details

API Models

Request/response schemas for all Engine and Manager endpoints
Error models with status codes and messages
Pagination models (limit, offset, total)

Data Flow Diagrams¶

Configuration Flow¶

Manager DB → Manager API → Engine GET /agents/config
                                    ↓
                            ConfigLoader resolves config
                                    ↓
                            Agent Initialization

When using the Manager, the Engine requests configuration via API key. When running standalone, the Engine loads configuration from a YAML file directly.

Request Flow¶

Client → Engine POST /agent/invoke → Input Guardrails
                                           ↓
                                    Agent Adapter
                                           ↓
                                    Framework Agent (LangGraph/Haystack/ADK)
                                           ↓
                            Output Guardrails ← Agent Response
                                    ↓
                            Client Response (JSON)

Streaming requests follow a similar flow but use Server-Sent Events (SSE) to stream execution events in real-time.

State Persistence Flow¶

Agent Execution → State Changes → Checkpointer/Session Service
                                         ↓
                                  Database (SQLite/PostgreSQL)
                                         ↓
                            Next Request → State Reload → Resume Execution

LangGraph uses checkpointers for state persistence. ADK uses session services. Haystack pipelines are stateless.

Observability Trace Flow¶

HTTP Request → Agent Execution → LLM Call → Tool Execution
     ↓              ↓                ↓            ↓
     └──────────────┴────────────────┴────────────┘
                         ↓
              Observability Handler (Langfuse/Phoenix/GCP)
                         ↓
              Provider Dashboard (traces, costs, metrics)

All execution steps are captured by observability handlers and sent to configured providers for monitoring and debugging.

Integration Points¶

Manager ↔ Engine¶

Configuration Retrieval Flow:

Engine starts with API key (via environment variable or command-line argument)
Engine sends GET /agents/config with Authorization: Bearer {api_key} header
Manager validates API key against database
Manager retrieves agent configuration from database
Manager returns EngineConfig as JSON
Engine deserializes config and initializes agent
Engine operates independently (no further Manager communication during requests)

Engine ↔ Framework Adapters¶

Unified Interface via BaseAgent Protocol:

All adapters implement the same protocol: - async def initialize(config, observability) - Framework-specific setup - async def invoke(message) - Synchronous request processing - async def stream(message) - Asynchronous event streaming - def infos() - Return adapter metadata

Request Translation:

Engine receives unified request format (dict with query, session_id, etc.)
Adapter translates to framework-native format (LangGraph state, Haystack input, ADK message)
Agent processes using framework's native API
Adapter translates response back to unified format
Engine returns standardized response

Engine ↔ Observability¶

Handler Attachment:

Observability handlers initialized during agent setup
Handlers registered as callbacks with framework (LangChain callbacks, Haystack tracing, etc.)
Framework invokes callbacks during execution (LLM calls, tool use, etc.)
Handlers send traces to providers (Langfuse, Phoenix, GCP)

Trace Propagation:

Root span created for HTTP request
Child spans for agent invocation, LLM calls, tool executions
Trace context propagated across async operations
Observability providers receive full trace hierarchy

Engine ↔ Guardrails¶

Validation Hooks:

Input Guardrails: Execute before agent invocation, validate user input
Output Guardrails: Execute after agent response, validate agent output
Each guardrail validator runs in sequence
If any validator fails: return HTTP 422 with custom reject_message
Validation results logged for monitoring

Engine ↔ MCP Servers¶

Lifecycle Management:

Server Startup: MCP servers launched as subprocesses when Engine initializes
Connection: Persistent stdio/HTTP/WebSocket connections established
Tool Discovery: Engine queries servers for available tools and registers them
Request Routing: Agent tool calls routed to appropriate MCP server
Health Monitoring: Engine monitors server health, restarts on failure
Shutdown: MCP servers terminated gracefully when Engine stops

Request Lifecycle (Detailed)¶

Step 1: Authentication¶

Client includes Authorization: Bearer {api_key} header (if using Manager)
Manager validates API key and returns agent configuration
Engine loads and caches configuration
Standalone mode skips this step (config loaded from file)

Step 2: Configuration Loading¶

Resolve config source (Manager API, local file, or programmatic config)
Substitute environment variables (${VAR_NAME} → actual values)
Validate against Pydantic schemas (fail fast on errors)
Initialize components (agent, observability handlers, guardrails, MCP servers)

Step 3: Input Validation¶

Execute input guardrails in sequence (e.g., ban list, PII detector)
Each validator checks input against configured rules
If validation fails:
Return HTTP 422 Unprocessable Entity
Include reject_message from guardrail config
Log validation failure with details

Step 4: Agent Invocation¶

Route request to appropriate adapter (LangGraph/Haystack/ADK)
Adapter translates request to framework-native format
Invoke agent (synchronous or streaming)
Observability handlers capture execution traces
Framework executes agent logic (LLM calls, tool use, state transitions)

Step 5: Output Validation¶

Execute output guardrails in sequence
Each validator checks agent response against rules
If validation fails:
Return HTTP 422 Unprocessable Entity
Include reject_message from guardrail config
Log validation failure

Step 6: Response¶

Format response (JSON for /agent/invoke, SSE for /agent/stream)
Include observability trace IDs for correlation
Return to client with appropriate HTTP status

State Management¶

Checkpointing (LangGraph)¶

SQLite Checkpointer: - File-based persistence (checkpoints.db) - Single-process only (no concurrent access) - Ideal for local development and testing

PostgreSQL Checkpointer: - Multi-process, production-ready - Concurrent access with connection pooling - Requires PostgreSQL database

In-Memory Checkpointer: - No persistence (state lost on restart) - Fastest performance for stateless testing

Thread Isolation: - Each session_id maps to unique thread - State isolated across conversations - Resume conversations after failures or restarts

Session Services (ADK)¶

InMemory: Development/testing, ephemeral state
Database: SQL-based persistence with SQLAlchemy
VertexAI: Cloud-native session management on Google Cloud

Memory Services (ADK)¶

InMemory: Ephemeral memory, no persistence
VertexAI: Cloud-backed memory with long-term storage

Deployment Architectures¶

Local Development¶

Developer Machine
├── Agent Code (agent.py)
├── Configuration (config.yaml)
└── idun agent serve --source=file --path=./config.yaml
    └── Engine running on http://localhost:8000

Simplest deployment for development. No database required if using in-memory checkpointing or stateless agents.

Self-Hosted (Future)¶

Infrastructure (VM/Kubernetes/Docker Compose)
├── Manager Service
│   ├── REST API (FastAPI)
│   ├── Web UI (React dashboard)
│   └── PostgreSQL Database
└── Engine Service(s)
    ├── Load Balancer (nginx/HAProxy)
    └── Multiple Engine instances (horizontal scaling)

Production deployment with centralized management. Engines fetch configurations from Manager via API keys.

Idun Cloud (Planned)¶

Managed Platform
├── Global Load Balancer (multi-region)
├── Manager (multi-region, high availability)
├── Engine Auto-scaling (based on traffic)
└── Managed PostgreSQL (automatic backups, replication)

Fully managed platform with zero infrastructure management. Automatic scaling, built-in observability, custom domains, and SSL.

Scalability Considerations¶

Async Processing¶

FastAPI built on Starlette with async/await support
Non-blocking I/O for all database and HTTP operations
Concurrent request handling without thread-per-request overhead
Efficient resource utilization under load

Connection Pooling¶

Database: SQLAlchemy connection pools for PostgreSQL (checkpointing, Manager database)
HTTP Clients: Connection reuse for observability providers and Manager API
MCP Servers: Persistent connections to avoid subprocess startup overhead

Resource Management¶

Graceful Degradation: Observability failures don't block agent execution
Circuit Breakers: Prevent cascading failures when external services are down
Timeout Handling: Configurable timeouts for LLM calls and tool executions
Memory Limits: Configurable limits to prevent runaway memory usage

Horizontal Scaling¶

Stateless Engine Instances: Multiple Engine instances can run concurrently
Load Balancer Distribution: Distribute requests across Engine instances
Shared State Storage: Checkpointing database shared across instances
Independent Scaling: Manager and Engine can scale independently based on load

Next Steps¶

Engine Concepts → - Deep dive into the runtime engine
Manager Concepts → - Learn about the control plane
Basic Configuration Guide → - Start building agents