01 — Engineering practice · Est. 2017

Engineering for systems that have to work.

Vongoid is a senior engineering practice. We design, build, and stabilise the cloud platforms, distributed systems, and infrastructure that serious products depend on — then we hand them over, well‑documented, ready to run.

Engagements: Embedded, fixed-scope, advisory
Reply time: ≤ 2 business days
Coverage: UTC−6 to UTC+1, on‑call

02

Capabilities

Six areas we work in. Each engagement is scoped around one of them, or the seam between two.

02.1

Cloud architecture

From a single VPC to multi-region platforms. Account structure, networking, identity boundaries, data residency — designed for what the system has to do in three years, not what slides well today.

02.2

Platform engineering

Internal developer platforms that turn “how do I ship this?” into a single command. Service templates, paved roads, golden paths — without locking teams into a single opinionated stack.

02.3

CI / CD pipelines

Reproducible builds, deterministic deploys, and an audit trail that holds up under scrutiny. Pipelines that survive a 3am page, not just a demo.

02.4

Infrastructure modernisation

Untangling snowflake servers, hand-rolled deployments, and ten years of accumulated workarounds. We migrate systems away from dependence on their original authors without freezing the product roadmap.

02.5

Reliability & observability

SLOs that mean something, dashboards an on-call engineer can actually act on, and a post-mortem culture that improves the system rather than assigning blame. We treat reliability as a product surface.

02.6

Technical leadership & advisory

A senior engineer or architect embedded with your team — for a quarter, a roadmap, or a hard problem. We help you make the calls, write the ADRs, and grow the people who will inherit the system.

03

Stance

Most platforms fail not because they were built wrong, but because they were built to be demonstrated, not operated.

We design for the second year of operation — the on‑call rotation, the security review, the day the original author is on parental leave.

  i.

    Production‑first thinking

    Every component is sized, instrumented, and documented for the failure modes it will actually see — not the ones that look good in a diagram.

  ii.

    Fewer moving parts

    We pick the smallest stack that solves the problem and resist the urge to add a new service every time a requirement shifts.

  iii.

    Operable, not just shippable

    If a human can't triage it at 3am with a runbook and a dashboard, it isn't finished — it is half‑shipped.

  iv.

    Documented decisions, not just code

    We leave behind architecture decision records, post‑mortems, and a written basis for every non‑obvious choice we make.

  v.

    Transferable knowledge

    Our goal is that, six months after we leave, your team can change the system with confidence — and replace us without fear.

04

Method

How an engagement actually moves. Five phases, no surprises.

  04.1

    Diagnose

    Read the code, read the runbooks, read the incident channel. We don't start with a proposal — we start with a shared map of the system as it actually is.

    → 1 to 2 weeks

  04.2

    Design

    Architecture, sequence diagrams, an explicit list of trade‑offs. We write it down so it can be argued with, not just admired.

    → ADRs · threat model · plan

  04.3

    Build

    Small, reviewable, instrumented. We work in your repositories, behind your code review, in front of your team. Nothing is shipped that your engineers can't defend.

    → merged in main

  04.4

    Stabilise

    SLOs, dashboards, runbooks, on‑call training, a real post‑mortem cadence. The system earns the right to be called production.

    → SLOs · on‑call · review

  04.5

    Transfer

    Written walkthroughs, recorded walkthroughs, and a deliberate handoff. We don't stay attached at the hip — we leave you more capable than we found you.

    → handover doc

05

Stack

Tools we are fluent in. Not a marketing matrix — a working inventory, grouped by what they actually do.

Cloud

  • AWS — primary
  • GCP — secondary
  • Cloudflare — edge
  • OVH · Hetzner — bare metal

Orchestration

  • Kubernetes (EKS · GKE · self-managed)
  • ECS · Fargate
  • Nomad
  • systemd (for the ungovernable)

Infrastructure as code

  • Terraform
  • Pulumi
  • Crossplane
  • Ansible · bash

Languages

  • Go
  • Rust
  • TypeScript · Node
  • Python

Data

  • PostgreSQL · Citus
  • Kafka · NATS
  • Redis · KeyDB
  • ClickHouse · DuckDB
  • S3 · object stores

CI / CD

  • GitHub Actions
  • Buildkite
  • Argo CD · Flux
  • GitLab CI

Observability

  • OpenTelemetry
  • Prometheus · Grafana
  • Loki · Tempo
  • Datadog (when required)

Security

  • OPA · policy as code
  • HashiCorp Vault
  • SBOM · cosign · sigstore
  • SOC 2 · ISO 27001 awareness

06

Contact

Ready to talk?

Whether you need a senior architect, a platform team, or an opinionated second pair of eyes — write to us. We respond within two business days, with a real engineer on the other end.

Or, if you prefer context first, send a one‑paragraph brief — the system, the team, the problem, the deadline.