v0.5.0 — Smart Routing & Cost Optimization ✅

Intelligent model selection and cost control across local and cloud providers.

Status: Completed — all 12 features implemented and tested (201 tests passing).

New Modules

| Module | Description |
| --- | --- |
| core/router.py | TaskRouter with 6 routing strategies, plus TaskComplexityClassifier |
| core/usage.py | UsageTracker with budget enforcement (task/session/day) |
| core/health.py | HealthMonitor with sliding-window latency and error rate tracking |
| core/benchmark.py | BenchmarkSuite for comparing models on latency/throughput/cost |

Features

Local (Ollama)

| ID | Feature | Status | Detail |
| --- | --- | --- | --- |
| ROUTE-01 | Local-first routing | | TaskRouter with LOCAL_FIRST strategy: always tries Ollama first |
| ROUTE-02 | Model benchmarking | | BenchmarkSuite.compare_models() runs same task across providers |
| ROUTE-03 | Ollama health monitoring | | HealthMonitor tracks latency, error rate, availability per provider |
| ROUTE-04 | Auto-model selection | | TaskComplexityClassifier + CAPABILITY_BASED routing matches task to model |
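
To make the benchmarking idea in ROUTE-02 concrete, here is a minimal sketch of running one task across several providers and collecting latency, throughput, and cost. It does not reproduce BenchmarkSuite itself: `compare_models_sketch`, `BenchmarkResult`, the callable-based provider shape, the whitespace token count, and the per-1k-token price table are all hypothetical stand-ins.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class BenchmarkResult:
    provider: str
    latency_ms: float
    tokens_per_second: float
    cost_usd: float


def compare_models_sketch(
    task: str,
    providers: Dict[str, Callable[[str], str]],
    price_per_1k_tokens: Dict[str, float],
) -> List[BenchmarkResult]:
    """Run the same task on every provider and sort the results by latency."""
    results = []
    for name, complete in providers.items():
        start = time.perf_counter()
        output = complete(task)
        # Clamp the elapsed time so the throughput division never hits zero.
        elapsed = max(time.perf_counter() - start, 1e-9)
        # Assumption: a crude whitespace split stands in for real token counts.
        tokens = len(output.split())
        results.append(
            BenchmarkResult(
                provider=name,
                latency_ms=elapsed * 1000.0,
                tokens_per_second=tokens / elapsed,
                # Providers missing from the price table are treated as free.
                cost_usd=tokens / 1000.0 * price_per_1k_tokens.get(name, 0.0),
            )
        )
    return sorted(results, key=lambda r: r.latency_ms)
```

Treating unpriced providers as free mirrors the local vs. cloud distinction: a local Ollama model would simply be left out of the price table.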

Cloud (OpenRouter)

| ID | Feature | Status | Detail |
| --- | --- | --- | --- |
| ROUTE-05 | Cost budgets | | UsageTracker.check_budget() with BudgetConfig (task/session/day limits) |
| ROUTE-06 | Fallback chains | | FALLBACK_CHAIN strategy skips unhealthy providers in configured order |
| ROUTE-07 | Provider health monitoring | | HealthMonitor with sliding window, consecutive error tracking |
| ROUTE-08 | Cost dashboard | | UsageTracker.get_cost_breakdown(): local vs cloud split, by-provider/agent |
| ROUTE-09 | Model price comparison | | BenchmarkSuite results include cost_usd per benchmark run |
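
The fallback behaviour described for ROUTE-06 can be sketched in a few lines: walk the configured chain in order and return the first provider that passes a health check. The function name `pick_from_fallback_chain` and the `is_available` callback are hypothetical; the real FALLBACK_CHAIN strategy may differ in how it handles an exhausted chain.

```python
from typing import Callable, Optional, Sequence


def pick_from_fallback_chain(
    chain: Sequence[str],
    is_available: Callable[[str], bool],
) -> Optional[str]:
    """Return the first provider in `chain` that passes the health check."""
    for name in chain:
        if is_available(name):
            return name
    # Every provider in the chain is unhealthy (or the chain is empty).
    return None
```

Returning `None` rather than raising leaves the caller free to decide whether an exhausted chain is an error or a signal to retry later.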

Hybrid

| ID | Feature | Status | Detail |
| --- | --- | --- | --- |
| ROUTE-10 | Complexity-based routing | | COMPLEXITY_BASED strategy: low→local, medium→mid-tier, high→top-tier |
| ROUTE-11 | Automatic failover | | Router checks HealthMonitor.is_available() before selecting provider |
| ROUTE-12 | Split execution | | Interface stub; delegates to complexity_based routing for now |
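
As an illustration of the low→local, medium→mid-tier, high→top-tier mapping in ROUTE-10, the sketch below classifies a task with a simple keyword heuristic and looks up a provider for the resulting tier. Everything here is an assumption made for illustration: the keyword lists, the word-count cutoff, the `classify` and `route_by_complexity` names, and the choice of which provider name sits in which tier. TaskComplexityClassifier may well use different signals.

```python
from enum import Enum
from typing import Dict


class Complexity(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Hypothetical keyword hints; a real classifier would likely use richer signals.
HIGH_HINTS = ("architecture", "refactor", "distributed", "security audit")
MEDIUM_HINTS = ("api", "auth", "database", "test")


def classify(task: str) -> Complexity:
    """Guess task complexity from keywords and prompt length."""
    text = task.lower()
    if any(hint in text for hint in HIGH_HINTS) or len(text.split()) > 60:
        return Complexity.HIGH
    if any(hint in text for hint in MEDIUM_HINTS):
        return Complexity.MEDIUM
    return Complexity.LOW


# Assumed tier assignment: local model, mid-tier cloud, top-tier cloud.
TIER_FOR: Dict[Complexity, str] = {
    Complexity.LOW: "local-ollama",
    Complexity.MEDIUM: "openrouter",
    Complexity.HIGH: "anthropic",
}


def route_by_complexity(task: str) -> str:
    """Map a task description to the provider name for its complexity tier."""
    return TIER_FOR[classify(task)]
```

Under these assumed hints, a prompt such as "build a REST API with auth" lands in the medium tier because it mentions "api" and "auth" but none of the high-complexity keywords.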

Key APIs

TaskRouter

```python
from agent_orchestrator.core.router import TaskRouter, RouterConfig, RoutingStrategy

# ollama, cloud, and claude are provider instances and health is a HealthMonitor,
# all assumed to be created elsewhere.
router = TaskRouter(
    providers={"local-ollama": ollama, "openrouter": cloud, "anthropic": claude},
    health_monitor=health,
    config=RouterConfig(strategy=RoutingStrategy.COMPLEXITY_BASED),
)
provider = router.route("build a REST API with auth")
```

UsageTracker

```python
from agent_orchestrator.core.usage import UsageTracker, UsageRecord, BudgetConfig

tracker = UsageTracker()
tracker.record(
    UsageRecord(
        provider="openrouter",
        model="qwen",
        input_tokens=1000,
        output_tokens=500,
        cost_usd=0.01,
    )
)
status = tracker.check_budget(BudgetConfig(max_per_session=1.0))
breakdown = tracker.get_cost_breakdown()  # local vs cloud split
```
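
For readers who want a feel for what session budget enforcement and a local vs. cloud cost split involve, here is a minimal standalone sketch. It is not UsageTracker's implementation: `SessionLedger`, `Usage`, `Budget`, and the `is_local` flag are hypothetical, and only a per-session limit is modelled (the task and day limits are left out).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Usage:
    provider: str
    cost_usd: float
    is_local: bool = False  # assumption: the caller knows which providers are local


@dataclass
class Budget:
    max_per_session: Optional[float] = None  # USD; None means no limit


class SessionLedger:
    """Accumulates usage records for one session."""

    def __init__(self) -> None:
        self.records: List[Usage] = []

    def record(self, usage: Usage) -> None:
        self.records.append(usage)

    def total(self) -> float:
        return sum(r.cost_usd for r in self.records)

    def within_budget(self, budget: Budget) -> bool:
        # Spending exactly the limit still counts as within budget.
        return budget.max_per_session is None or self.total() <= budget.max_per_session

    def breakdown(self) -> Dict[str, float]:
        """Split the accumulated cost into local and cloud totals."""
        split = {"local": 0.0, "cloud": 0.0}
        for r in self.records:
            split["local" if r.is_local else "cloud"] += r.cost_usd
        return split
```

A router could call `within_budget` before dispatching to a paid provider and fall back to a local model once the check fails.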

HealthMonitor

```python
from agent_orchestrator.core.health import HealthMonitor

monitor = HealthMonitor(max_consecutive_errors=5, error_rate_threshold=0.5)
monitor.record_success("openrouter", latency_ms=250.0)
monitor.record_error("local-ollama", "connection refused")
best = monitor.get_best_provider(["local-ollama", "openrouter"])
```
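
The sliding-window and consecutive-error tracking mentioned for HealthMonitor can be approximated with a bounded deque per provider. The sketch below is a guess at the general technique, not the module's code: `SlidingWindowHealth`, the `window` size, the rule that an error rate equal to the threshold still counts as healthy, and the lowest-average-latency tie-break in `best_provider` are all assumptions.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Optional, Tuple


class SlidingWindowHealth:
    def __init__(self, window: int = 20, max_consecutive_errors: int = 5,
                 error_rate_threshold: float = 0.5) -> None:
        self.max_consecutive_errors = max_consecutive_errors
        self.error_rate_threshold = error_rate_threshold
        # Each entry is (succeeded, latency_ms); the deque drops the oldest
        # entry once more than `window` events have been recorded.
        self._events: Dict[str, Deque[Tuple[bool, float]]] = defaultdict(
            lambda: deque(maxlen=window)
        )
        self._streak: Dict[str, int] = defaultdict(int)

    def record_success(self, provider: str, latency_ms: float) -> None:
        self._events[provider].append((True, latency_ms))
        self._streak[provider] = 0  # a success resets the error streak

    def record_error(self, provider: str) -> None:
        self._events[provider].append((False, 0.0))
        self._streak[provider] += 1

    def error_rate(self, provider: str) -> float:
        events = self._events[provider]
        if not events:
            return 0.0
        return sum(1 for ok, _ in events if not ok) / len(events)

    def is_available(self, provider: str) -> bool:
        return (self._streak[provider] < self.max_consecutive_errors
                and self.error_rate(provider) <= self.error_rate_threshold)

    def avg_latency_ms(self, provider: str) -> float:
        latencies = [ms for ok, ms in self._events[provider] if ok]
        # Providers with no successful calls sort last.
        return sum(latencies) / len(latencies) if latencies else float("inf")

    def best_provider(self, candidates: List[str]) -> Optional[str]:
        healthy = [p for p in candidates if self.is_available(p)]
        return min(healthy, key=self.avg_latency_ms) if healthy else None
```

Because old events fall out of the window, a provider that failed earlier becomes eligible again after enough recent successes, without any explicit reset.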