Services
What Ibuild.
AI on your GPUs, the cluster underneath, and the apps on top.
The three things I do.
Private cloud and hybrid infra
The cluster underneath the AI.
Talos Linux + Kubernetes clusters
FluxCD GitOps pipelines
Proxmox virtualisation and GPU passthrough
Grafana, Prometheus, Loki monitoring
Self-hosted GitLab and CI runners
Custom apps
Software for what off-the-shelf can't do.
FastAPI services and internal tools
Next.js dashboards and admin panels
n8n workflow automation
Self-hosted SaaS replacements
API integrations between your existing tools
Foundation
On-prem AI
LLMs running on your GPUs.
vLLM and Ollama deployments
RAG with local vector stores
Fine-tuning on your hardware
OpenAI-compatible routing
GPU monitoring and capacity planning
The actual stack
No magic.
Just the tools that work.
Inference
vLLMOllamaLMCache
Models
MistralMiniMaxQwen
Orchestration
Talos LinuxKubernetesFluxCDHelm
Virtualisation
ProxmoxGPU passthroughCloud-init
Automation
OpenTofuAnsiblen8n
CI/CD
GitLab self-hostedGitLab CI
Observability
GrafanaPrometheusLokiClickHouse
App layer
FastAPINext.jsPostgresRedis
Privacy
Erebus PII filterHeadscale
What that looks like in practice.
Internal Q&A on private docs
RAG over your wikis, local embeddings, nothing leaves the network.
Chat UIs for client-facing teams
LibreChat or OpenWebUI with PII filtering and audit logs.
Document processing pipelines
Extract, classify, summarise. Runs on a single GPU box.
Custom Kubernetes platforms
Talos from scratch, FluxCD for everything.
GPU monitoring
Grafana dashboards that show what's happening, alerts before a model OOMs.
Sovereign AI for EU clients
Hosted inside the EU, on your hardware or colo.
Got something to build?
Tell me what you need. I'll tell you honestly whether I can help.
Start the conversation