Services

What I build.

AI on your GPUs, the cluster underneath, and the apps on top.

The three things I do.

Private cloud and hybrid infra

The cluster underneath the AI.

Talos Linux + Kubernetes clusters
FluxCD GitOps pipelines
Proxmox virtualisation and GPU passthrough
Grafana, Prometheus, Loki monitoring
Self-hosted GitLab and CI runners

Custom apps

Software for what off-the-shelf can't do.

FastAPI services and internal tools
Next.js dashboards and admin panels
n8n workflow automation
Self-hosted SaaS replacements
API integrations between your existing tools

Foundation

On-prem AI

LLMs running on your GPUs.

vLLM and Ollama deployments
RAG with local vector stores
Fine-tuning on your hardware
OpenAI-compatible routing
GPU monitoring and capacity planning

The actual stack

No magic.
Just the tools that work.

Inference: vLLM, Ollama, LMCache
Models: Mistral, MiniMax, Qwen
Orchestration: Talos Linux, Kubernetes, FluxCD, Helm
Virtualisation: Proxmox, GPU passthrough, Cloud-init
Automation: OpenTofu, Ansible, n8n
CI/CD: GitLab self-hosted, GitLab CI
Observability: Grafana, Prometheus, Loki, ClickHouse
App layer: FastAPI, Next.js, Postgres, Redis
Privacy: Erebus PII filter, Headscale

What that looks like in practice.

Internal Q&A on private docs

RAG over your wikis, local embeddings, nothing leaves the network.
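
To make the retrieval step concrete, here is a minimal sketch of the idea, not the production setup. It assumes a toy bag-of-words `embed` function standing in for a real local embedding model, and the `WIKI` pages, file names, and `build_prompt` helper are all hypothetical. Everything runs in-process, so no text leaves the machine.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a local embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, docs: dict[str, str], k: int = 1) -> list[str]:
    """Return the names of the k pages most similar to the question."""
    q = embed(question)
    ranked = sorted(docs, key=lambda name: cosine(q, embed(docs[name])), reverse=True)
    return ranked[:k]


def build_prompt(question: str, docs: dict[str, str]) -> str:
    """Assemble the text that would be sent to a locally hosted model."""
    context = "\n".join(docs[name] for name in retrieve(question, docs))
    return f"Context:\n{context}\n\nQuestion: {question}"


# Hypothetical wiki pages, invented for illustration only.
WIKI = {
    "vpn.md": "how to connect to the office vpn with headscale",
    "leave.md": "how to request annual leave and sick leave",
    "gpu.md": "how to book time on the shared gpu box",
}

if __name__ == "__main__":
    print(build_prompt("how do I request sick leave", WIKI))
```

In a real deployment the word counts would be replaced by vectors from an embedding model served on the same network, and the dictionary by a local vector store; the flow of embed, rank, and build the prompt stays the same.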

Chat UIs for client-facing teams

LibreChat or OpenWebUI with PII filtering and audit logs.

Document processing pipelines

Extract, classify, summarise. Runs on a single GPU box.

Custom Kubernetes platforms

Talos from scratch, FluxCD for everything.

GPU monitoring

Grafana dashboards that show what's happening, alerts before a model OOMs.

Sovereign AI for EU clients

Hosted inside the EU, on your hardware or colo.

Got something to build?

Tell me what you need. I'll tell you honestly whether I can help.

Start the conversation