Case Studies

Projects that blend reliability, automation, and thoughtful developer experience.

I enjoy combining pragmatic tooling with opinionated practices. Below is a sample of hands-on delivery work, internal platform initiatives, and experiments from the homelab.

At a glance

  • 10+ years building infrastructure platforms
  • Advocate for GitOps, Kubernetes, and automation-first operating models
  • Enjoy mentoring teams on observability and incident response

Platform delivery

Platforms that balance velocity and reliability, built with empathetic tooling and guardrails.

Kubernetes Platform

Managed multi-tenant clusters

Designed a managed Kubernetes footprint with composable add-ons, GitOps promotion, and golden-path templates powering dozens of services.

Kubernetes · FluxCD · Terraform
  • Built bootstrap and day-2 automation that cut cluster bootstrap time from days to under an hour.
  • Introduced workload standards with policy-as-code and self-service guardrails.

Delivery Tooling

Opinionated CI/CD pipelines

Led a migration to reusable pipelines with built-in quality gates, progressive delivery strategies, and drift detection for infrastructure code.

GitLab CI · Argo Rollouts · Policy-as-code
  • Provided paved-road templates adopted by product squads in under two sprints.
  • Implemented automated rollback with SLO-aware deployment gates.
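
As a rough illustration of what an SLO-aware deployment gate can look like, here is a minimal Python sketch. It is not the production implementation: the `CanaryWindow` type, the `gate_decision` function, and every threshold below are hypothetical names and values chosen for the example. The idea is to compare the canary's observed error rate against the error budget implied by the SLO and to roll back when the budget is burning too fast.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanaryWindow:
    """Request counts observed for the canary during one evaluation window."""
    total_requests: int
    failed_requests: int


def gate_decision(
    window: CanaryWindow,
    slo_target: float = 0.999,   # assumed availability SLO
    max_burn_rate: float = 2.0,  # assumed tolerance: 2x the budgeted error rate
    min_requests: int = 100,     # assumed minimum sample size before deciding
) -> str:
    """Return "promote", "rollback", or "wait" for one canary step.

    Burn rate = observed error rate / error budget, where the error
    budget is (1 - slo_target).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if window.total_requests < min_requests:
        return "wait"  # too little traffic to judge either way
    error_rate = window.failed_requests / window.total_requests
    burn_rate = error_rate / (1.0 - slo_target)
    return "rollback" if burn_rate > max_burn_rate else "promote"
```

In a real pipeline the counts would come from the metrics backend, and the returned decision would drive the rollout controller. Both of those integrations are left out of this sketch.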

Observability & operations

Operational insights that tighten feedback loops and boost confidence during incidents.

Telemetry

Unified observability stack

Consolidated metrics, tracing, and logging with SLO reporting and actionable runbooks integrated into on-call rotations.

Grafana · OpenTelemetry · Loki
  • Cut mean time to detect by introducing shared dashboards and hypothesis-driven alerting.
  • Automated incident timelines and retros with event streams.

Resilience

Chaos & resilience practice

Introduced chaos days with lightweight tooling to validate failure scenarios and strengthen incident response muscle memory.

Gremlin · GameDays · SLOs
  • Partnered with product teams to prioritise test scenarios that align with customer promises.
  • Fed learnings back into automation and runbooks to reduce alert fatigue.

Lab notes

Personal experiments that keep skills sharp and ideas flowing.

Proxmox & Talos hybrid

Combining bare-metal Proxmox with Talos-managed Kubernetes for efficient homelab orchestration.

Infrastructure

Automated network failover

Dual-WAN gateway automation with Ansible and health checks to keep the homelab resilient.

Networking
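
The failover decision itself is simple enough to sketch in a few lines. The Python below is an illustration only, not the Ansible automation described above: `FailoverState`, `next_state`, `run`, and both thresholds are hypothetical. It shows one common approach, which is to require several consecutive failed health checks before switching uplinks and several consecutive successes before switching back, so that a flapping link does not cause constant route changes.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class FailoverState:
    active: str = "primary"  # which uplink currently carries traffic
    fail_streak: int = 0     # consecutive failed checks on the primary
    ok_streak: int = 0       # consecutive passed checks on the primary


def next_state(
    state: FailoverState,
    primary_healthy: bool,
    fail_threshold: int = 3,     # assumed: failures needed to fail over
    recover_threshold: int = 5,  # assumed: successes needed to fail back
) -> FailoverState:
    """Apply one primary-uplink health-check result and return the new state."""
    if primary_healthy:
        ok, fail = state.ok_streak + 1, 0
    else:
        ok, fail = 0, state.fail_streak + 1

    active = state.active
    if active == "primary" and fail >= fail_threshold:
        active = "secondary"
    elif active == "secondary" and ok >= recover_threshold:
        active = "primary"
    return FailoverState(active, fail, ok)


def run(results: Iterable[bool], state: Optional[FailoverState] = None) -> FailoverState:
    """Fold a sequence of health-check results into a final state."""
    state = state or FailoverState()
    for healthy in results:
        state = next_state(state, healthy)
    return state
```

For example, `run([False, False, True, False, False])` stays on the primary uplink because the single success resets the failure streak, while three failures in a row trigger a switch to the secondary.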

Observability stack in a box

Lightweight Grafana + Loki + Tempo bundle scripted for quick spin-up during incident postmortems.

Observability

Looking for a hand with your platform roadmap?

Happy to talk through a tricky migration, automation gap, or observability challenge.