Skip to main content

AI Infrastructure Strategy

Budget: from ~42 EUR/month to ~625 EUR/month (scaling with revenue)

Executive Summary

Build and scale a multi-agent orchestration system based on open-source LLMs, starting with a conservative low-cost setup on AWS + OpenRouter, evolving toward a hybrid self-hosted infrastructure as revenue grows.

MetricValue
Starting budget~42 EUR/month
Scaling threshold600 EUR/month revenue
Final targetHybrid AWS + Vast.ai H200 at ~625 EUR/month

Options Evaluated

SolutionCost/monthProsCons
Claude Max $200~185 EURZero setup, Opus 4.6Rate limits, no fine-tuning
AWS + Vast.ai H200 (12h/day)~625 EURPrivate, fine-tuning, unlimitedHigh fixed cost
AWS + OpenRouter Qwen3 30B~42 EURMinimum cost, zero GPU infraPay-per-token, no fine-tuning

Key Decisions

DecisionChoiceReason
LLM ModelQwen3 30B A3BBest quality/cost for agents, open-weight
OrchestratorStateGraph (custom)Stateful workflows, provider-agnostic
Initial providerOpenRouterSingle endpoint, multi-model, zero infra
Cloud infraAWSReliability, ecosystem
GPU (Phase 4)Vast.ai H200Lowest market price, interruptible OK

Scaling Trigger

Self-hosted GPU becomes justified not for token cost savings, but for:

  1. Fine-tuning on proprietary data (impossible with OpenRouter)
  2. Total privacy (sensitive data stays in-house)
  3. Guaranteed latency without third-party dependency
  4. Custom domain-specific model