arqora / infrastructure

What runs everything else.

The infrastructure behind Arqora's systems — what we run, why we run it that way, and the principles that shape every decision about how it's built and operated.

Infrastructure is only invisible when it's working.

Most teams treat infrastructure as a cost to minimize. Arqora treats it as a system to design — one that shapes the reliability, security, and longevity of everything built on top of it. The decisions made at the infrastructure layer are the hardest to reverse.

We don't over-engineer for scale we don't have. But we also don't under-invest in the foundations that everything else depends on. The goal is infrastructure that handles expected load comfortably, fails gracefully when pushed, and recovers quickly without manual intervention.

Principles

Philosophy

Self-hosted where it matters

Arqora doesn't default to managed cloud for everything. Where data sensitivity, cost trajectory, or long-term control justify it, we run our own hardware. Where managed services offer genuine leverage without unacceptable trade-offs, we use them.

Resilience

Designed to recover, not just run

Every service has a defined recovery path. Runbooks exist before incidents do. Backups are restored on a schedule to confirm they actually work. The question is never 'will this fail' — it's 'how fast can we recover when it does'.

Observability

Observe everything, alert on signal

Metrics are cheap. Useful alerts are hard. Arqora's monitoring is tuned to surface real degradation — not to fire on every anomaly. A noisy alert system trains people to ignore alerts.
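
One common way to alert on signal rather than on every anomaly is to require a breach to persist across most of a sliding window before firing. The sketch below illustrates that idea only; the class name, the 5% error-rate threshold, and the window sizes are assumptions made for the example, not Arqora's actual alert rules.

```python
from collections import deque


class SustainedThresholdAlert:
    """Fire only when a metric breaches its threshold for most of a window."""

    def __init__(self, threshold: float, window: int, min_breaches: int) -> None:
        if not 0 < min_breaches <= window:
            raise ValueError("min_breaches must be between 1 and window")
        self.threshold = threshold
        self.min_breaches = min_breaches
        # Keeps only the most recent `window` breach flags.
        self.samples = deque(maxlen=window)

    def observe(self, value: float) -> bool:
        """Record one sample and return True if the alert should fire."""
        self.samples.append(value > self.threshold)
        return sum(self.samples) >= self.min_breaches


def demo() -> list:
    # Hypothetical numbers: one isolated spike, then sustained degradation.
    alert = SustainedThresholdAlert(threshold=0.05, window=5, min_breaches=4)
    error_rates = [0.01, 0.20, 0.01, 0.06, 0.07, 0.08, 0.09]
    return [alert.observe(rate) for rate in error_rates]
```

In `demo()`, the single spike to 0.20 never fires the alert; it fires only once four of the last five samples exceed the threshold.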

Security

Minimal attack surface by default

Services expose only what they need to. Networks are segmented. Administrative access requires a tunnel. Every port, every credential, and every public endpoint is an explicit decision — not a default.

Reproducibility

Infrastructure as code, not clicks

Every infrastructure change is captured in version-controlled configuration. If it can't be reproduced from code, it doesn't exist in any durable sense. Click-ops is how you get systems no one understands.

Capacity

Capacity with headroom

Arqora doesn't run infrastructure at 90% utilisation. Headroom isn't waste — it's the difference between a spike being invisible and a spike being an incident. We size for comfortable operation, not theoretical maximum efficiency.
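
The arithmetic behind headroom is simple enough to sketch. The functions and the 1.5x spike figure below are illustrative assumptions, not measured Arqora numbers; they only show why a high steady-state utilisation leaves no room for a spike.

```python
def peak_utilisation(baseline: float, spike_multiplier: float) -> float:
    """Utilisation (fraction of capacity) when load jumps by spike_multiplier."""
    return baseline * spike_multiplier


def max_baseline(spike_multiplier: float, ceiling: float = 1.0) -> float:
    """Highest steady-state utilisation that still absorbs the spike under ceiling."""
    return ceiling / spike_multiplier


if __name__ == "__main__":
    for baseline in (0.9, 0.5):
        peak = peak_utilisation(baseline, spike_multiplier=1.5)
        verdict = "over capacity" if peak > 1.0 else "absorbed"
        print(f"baseline {baseline:.0%}, 1.5x spike: {verdict}")
```

Under these assumed numbers, a service idling at 90% is pushed past capacity by a 1.5x spike, while one idling at 50% absorbs it with room to spare. `max_baseline(1.5)` puts the break-even point at roughly two thirds of capacity.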

The stack

Edge / CDN

Cloudflare — DNS, DDoS mitigation, edge caching

Asset delivery and static front-end serving

Compute

Dedicated VPS (Hetzner) for primary workloads

Containerised services via Docker Compose

No Kubernetes — scope doesn't justify the overhead

Data

PostgreSQL: managed Neon for serverless workloads, self-hosted for critical workloads

Redis for ephemeral state and queues

Backups encrypted, versioned, and tested quarterly

Networking

Private VLANs between internal services

WireGuard for administrative access

No public SSH — everything goes through the tunnel

Observability

Structured JSON logs shipped to a central aggregator

Uptime monitoring with alerting on degradation, not just outage

Dashboards for latency, error rate, and saturation
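
As an illustration of the first item, structured JSON logs can be produced with nothing but Python's standard `logging` module. This is a minimal sketch; the field names (`ts`, `level`, `logger`, `message`) and the `fields` convention for extra context are assumptions for the example, and shipping the output to an aggregator is left to whatever collects the process's stderr.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Context passed as extra={"fields": {...}} is merged into the object.
        payload.update(getattr(record, "fields", {}))
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload, default=str)


def build_logger(name: str, stream=None) -> logging.Logger:
    """Return a logger that writes JSON lines to `stream` (stderr by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicate plain-text output via the root logger
    return logger


if __name__ == "__main__":
    log = build_logger("api")
    log.info("request served", extra={"fields": {"route": "/health", "latency_ms": 12}})
```

Because every line is a self-contained JSON object, latency and error-rate dashboards can be built by querying fields rather than parsing free text.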

Secrets

No secrets in environment variables at runtime; they are pulled from the vault at startup

Rotation schedules enforced for all long-lived credentials

Audit log on all secret access
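
The pattern in this list can be sketched as a small in-process store: secrets are fetched once at startup, held in memory instead of the environment, and every read is recorded. Everything here is hypothetical, including the `SecretStore` name and the `fetch` callable, which stands in for whatever vault client is actually used. In a real deployment the audit trail would normally live in the vault itself, not in the application.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Dict, List, Tuple


@dataclass
class SecretStore:
    """Holds secrets in process memory and records every read."""

    fetch: Callable[[str], str]  # hypothetical stand-in for a vault client call
    # Excluded from repr so secret values never end up in logs by accident.
    _cache: Dict[str, str] = field(default_factory=dict, init=False, repr=False)
    audit_log: List[Tuple[str, str, str]] = field(default_factory=list, init=False)

    def load(self, names: List[str]) -> None:
        """Pull all required secrets at startup; a missing one fails fast."""
        for name in names:
            self._cache[name] = self.fetch(name)

    def get(self, name: str, caller: str) -> str:
        """Return a secret and append (timestamp, caller, name) to the audit log."""
        stamp = datetime.now(timezone.utc).isoformat()
        self.audit_log.append((stamp, caller, name))
        return self._cache[name]


if __name__ == "__main__":
    fake_vault = {"db_password": "example-only"}  # placeholder, not a real secret
    store = SecretStore(fetch=fake_vault.__getitem__)
    store.load(["db_password"])
    store.get("db_password", caller="billing-service")
    print(store.audit_log[0][1:])
```

Loading everything up front means a misconfigured secret stops the service at boot rather than on the first request that needs it.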

Decisions and why

Hetzner over AWS for primary compute.

The cost differential at Arqora's scale is significant, and the workloads don't need the elastic scaling that justifies AWS's pricing. Dedicated hardware with predictable cost and full control is the right trade-off here.

Docker Compose, not Kubernetes.

Kubernetes solves real problems at real scale. Arqora doesn't have those problems yet. Compose is understood by everyone on the team, debuggable without specialist knowledge, and adequate for current requirements.

WireGuard for all administrative access.

No SSH exposed to the public internet. All administrative access goes through a WireGuard tunnel. The attack surface for the most sensitive operations is minimal by design.
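
A policy like this is easy to verify from the outside. The sketch below simply attempts a TCP connection to port 22 and treats success as a failure of the policy; the function names are made up for the example, and the check is only meaningful when run from a machine that is not connected to the tunnel.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def assert_ssh_not_public(public_host: str) -> None:
    """Raise if port 22 answers on the given public address."""
    if port_is_open(public_host, 22):
        raise RuntimeError(f"SSH is reachable on {public_host}:22")
```

This only probes the default SSH port. A daemon listening on a non-standard port would need a wider scan to catch.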

Neon for serverless Postgres at the edge.

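For workloads that run at the edge or in serverless environments, traditional connection pooling breaks down: short-lived invocations can't hold a long-lived pool, so each one tends to open its own database connection. Neon's architecture handles this cleanly. For workloads that don't have this constraint, we self-host.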

Backups tested, not just taken.

A backup that's never been restored is a hypothesis, not a guarantee. Arqora's backup verification runs on a schedule — automated restores to isolated environments confirm the data is actually recoverable before it's ever needed.
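
The shape of such a check can be sketched in a few lines. To keep the example self-contained and runnable, SQLite stands in for PostgreSQL here, which is an assumption of the sketch and not how Arqora's databases work; the function name and the row-count checks are also illustrative. The structure is what matters: restore into a scratch target, check integrity, then run sanity queries.

```python
import sqlite3
from typing import Dict


def restore_and_verify(backup_path: str, expected_min_rows: Dict[str, int]) -> Dict[str, int]:
    """Restore a backup into an isolated in-memory database and sanity-check it."""
    source = sqlite3.connect(backup_path)
    scratch = sqlite3.connect(":memory:")  # isolated: never touches live data
    try:
        source.backup(scratch)  # the "restore" step in this stand-in
        status = scratch.execute("PRAGMA integrity_check").fetchone()[0]
        if status != "ok":
            raise RuntimeError(f"integrity check failed: {status}")
        counts = {}
        for table, minimum in expected_min_rows.items():
            # Table names come from trusted configuration, not user input.
            count = scratch.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
            if count < minimum:
                raise RuntimeError(f"{table}: {count} rows, expected at least {minimum}")
            counts[table] = count
        return counts
    finally:
        source.close()
        scratch.close()
```

A scheduled job would call this against the latest backup and alert on any exception, so an unrecoverable backup is discovered by the check rather than during an incident.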

Infrastructure decisions age. What's right at one scale or one moment isn't right indefinitely. Arqora revisits these choices when the requirements change — not before, and not never.

The goal isn't a perfect system. It's a system that fails predictably, recovers quickly, and doesn't surprise the people running it.
