Edge architecture · a comparative study · 2026

Five ways to put Kubernetes at the edge.

A quantitative comparison of edge Kubernetes deployment topologies for large distributed infrastructures — built around a simulator you can drive with your own parameters.

§ 01 · Context

The decision that cascades.

A distributed edge deployment — thousands of small sites, each with a handful of servers, coordinated by a central management cluster — forces one question first: where does the Kubernetes control plane live? Five topologies are credible, each distributing control-plane components differently between the edge sites and the central cluster.

The trade-offs between robustness, latency, cost, and operational complexity aren't reducible to a single number — but they can be made quantitative. This page formalises the parameters, defines the metrics, and provides a live simulator. The simulator is the centerpiece: drive the parameters, watch the metrics shift.

§ 02 · The five models

Each topology places control-plane components differently.

01

Fully autonomous

Each edge site is a self-contained Kubernetes cluster with its own redundant control plane on dedicated master servers, plus workers. Maximum robustness, maximum overhead.

local control plane · dedicated masters · high overhead

02

Shared autonomous

Each site is self-contained, but master and worker components are co-located on the same servers. Same robustness, less hardware waste, identical operational complexity.

local control plane · co-located · medium overhead

03

Hosted control plane (Kamaji)

Each tenant control plane runs as pods inside a central management cluster. Edge sites contain workers only. The hyperscaler-style hosted control plane.

hosted CP · no edge masters · low overhead

04

Distributed autonomous

A hybrid: one master runs locally as a fallback while the rest of the control plane runs remotely. Mixes patterns in a way that fails to maintain control-plane quorum during a partition.

hybrid · no quorum on partition · questionable

05

Gigantic cluster

A single Kubernetes cluster with masters in the cloud and every edge server as a worker. Operationally simple but hits scalability ceilings and exposes the maximum blast radius.

single cluster · hits K8s limits · huge blast radius

§ 03 · Simulator

Drive the parameters.

Sliders update the metrics in real time. Cells turn warning-coloured or danger-coloured when a model crosses a meaningful threshold for the chosen deployment scale.

Parameters

N = 200 · S = 8 · mc = 3 · k = 3 · d = 10 · α = 0.70 · Ll = 0.5 ms · Lw = 20.0 ms · Cops = €10k/y · Pmaster = 200 W

Metrics

Model                      | CP % | Fail-ok | WAN-ok | Lat ms | Blast | Max/cl | OpEx/y | Energy/y
(1) Fully autonomous       | 37.6 | 1       | yes    | 0.5    | 0     | 8      | €3.0M  | 1.1 GWh
(2) Shared autonomous      |  7.7 | 1       | yes    | 0.5    | 0     | 8      | €2.6M  | 215 MWh
(3) Hosted CP (Kamaji)     |  3.8 | 1       | no     | 20.5   | 200   | 8      | €1.0M  | 110 MWh
(4) Distributed autonomous | 15.8 | 0       | no     | 6.5    | 200   | 8      | €1.2M  | 461 MWh
(5) Gigantic cluster       |  0.2 | 1       | no     | 20.5   | 200   | 1,600  | €30k   | 5 MWh

Legend: good · warning · danger. Ops scales: M1=1.5 · M2=1.3 · M3=0.5 · M4=0.6 · M5=3.0
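To make the pipeline concrete, here is a minimal Python sketch that reproduces the table above from the § 04 defaults, assuming the formulas of § 05. It is illustrative, not the page's actual simulator code: the names (Params, FCP, metrics, specs) and the per-model placement encoding are assumptions.

```python
# Minimal sketch of the metric pipeline, assuming the § 05 formulas.
from dataclasses import dataclass
from math import ceil, floor

@dataclass
class Params:
    N: int = 200          # edge sites
    S: int = 8            # servers per site
    mc: int = 3           # central masters
    k: int = 3            # replicas per hosted tenant control plane
    d: int = 10           # tenant control planes per central server
    alpha: float = 0.70   # local cache hit fraction
    Ll: float = 0.5       # intra-site latency, ms
    Lw: float = 20.0      # WAN latency, ms
    Cops: float = 10_000  # ops cost per cluster per year, EUR
    Pmaster: float = 200  # master power draw, W

FCP = 0.2  # a co-located master counts as a fifth of a dedicated server

def metrics(p: Params) -> dict:
    hosting = ceil(p.N * p.k / p.d)  # central servers hosting tenant CPs
    # (name, dedicated local masters, co-located local masters,
    #  uses hosted CP, latency mode, ops scale, cluster count)
    specs = [
        ("(1) Fully autonomous",       3, 0, False, "local",  1.5, p.N),
        ("(2) Shared autonomous",      0, 3, False, "local",  1.3, p.N),
        ("(3) Hosted CP (Kamaji)",     0, 0, True,  "remote", 0.5, p.N),
        ("(4) Distributed autonomous", 1, 0, True,  "hybrid", 0.6, p.N),
        ("(5) Gigantic cluster",       0, 0, False, "remote", 3.0, 1),
    ]
    rows = {}
    for name, mded, mshared, hosted, mode, ops_scale, clusters in specs:
        mhost = hosting if hosted else 0
        meq = (mded + FCP * mshared) * p.N + p.mc + mhost   # § 05 · 01
        overhead = 100 * meq / (p.N * p.S + p.mc + mhost)   # § 05 · 02
        local = mded + mshared                              # local masters
        fail_ok = floor(((local or p.mc) - 1) / 2)          # § 05 · 03
        wan_ok = local > 0 and not hosted                   # § 05 · 04
        lat = {"local": p.Ll,                               # § 05 · 05
               "remote": p.Ll + p.Lw,
               "hybrid": p.Ll + (1 - p.alpha) * p.Lw}[mode]
        blast = 0 if wan_ok else p.N                        # § 05 · 06
        max_nodes = p.N * p.S if clusters == 1 else p.S     # § 05 · 07
        opex = clusters * ops_scale * p.Cops                # § 05 · 08
        energy_kwh = meq * p.Pmaster * 24 * 365 / 1000      # § 05 · 09
        rows[name] = (round(overhead, 1), fail_ok, wan_ok, round(lat, 1),
                      blast, max_nodes, opex, round(energy_kwh))
    return rows

for name, row in metrics(Params()).items():
    print(name, row)
```

Run with the defaults, this reproduces the table's values, from 37.6 % overhead and €3.0M/y for model 1 down to 0.2 % and €30k/y for the gigantic cluster.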

Columns

CP %
Control-plane resources as a percentage of total compute (control overhead). Dedicated master servers count fully; co-located ones count at fcp=0.2.
Fail-ok
Local server failures the site can absorb before its control plane loses write availability.
WAN-ok
Whether the edge site keeps a writable control plane during a WAN partition with the central cluster.
Lat ms
Effective round-trip latency for an API call from the site to a control-plane server.
Blast
Number of edge sites disrupted by a failure of the central management cluster.
Max/cl
Maximum nodes joined to a single Kubernetes cluster. The practical scalability ceiling sits around 5,000.
OpEx/y
Annual operations cost = clusters × ops-scale × Cops. Ops-scale reflects per-cluster operational difficulty.
Energy/y
Annual energy of the master infrastructure = master-nodes-equivalent × Pmaster × 24 h × 365.
§ 04 · Parameters

The ten knobs.

Symbol  | Name             | Range         | Default | Description
N       | Edge sites       | 1 – 5,000     | 200     | Number of edge sites in the deployment
S       | Servers per site | 2 – 30        | 8       | Servers at each edge site (assumed uniform)
mc      | Central masters  | 1 – 7         | 3       | Master nodes in the central management cluster
k       | CP replicas      | 1 – 5         | 3       | Replication factor of each hosted tenant control plane
d       | CP density       | 1 – 20        | 10      | Tenant control planes packed per central server (Kamaji)
α       | Local cache hit  | 0 – 1         | 0.70    | Fraction of API operations servable within a local site
Ll      | Local latency    | 0.1 – 5 ms    | 0.5 ms  | Intra-site latency to nearest server
Lw      | WAN latency      | 1 – 100 ms    | 20 ms   | Edge-to-central WAN latency
Cops    | Ops cost         | €1k – 50k / y | €10k    | Operational cost per cluster per year
Pmaster | Master power     | 100 – 400 W   | 200 W   | Average power draw of a master server
§ 05 · Metrics

Nine derived quantities.

01

Total master footprint

Meq = (mded + fcp·mshared)·N + mc + ⌈(N·k) ÷ d⌉
    = Me·N + mc + Mhosting

Total master footprint across the deployment.
mded is the number of servers per site dedicated to master nodes, while mshared is the number of servers acting as both master and worker. Me is the master footprint per edge site (co-located masters count at fcp = 0.2). Mhosting is the number of central servers needed to host tenant control planes.
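With the defaults (N = 200, k = 3, d = 10, mc = 3): model 1 gives Meq = 3·200 + 3 = 603; model 2 gives 0.2·3·200 + 3 = 123; model 3 gives 3 + ⌈(200·3) ÷ 10⌉ = 63; model 4 gives 200 + 3 + 60 = 263; model 5 is just mc = 3.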

02

Control plane overhead

O = Meq ÷ (N·S + mc + Mhosting) × 100

Master server-equivalents as a percentage of total compute servers.
Note: the model assumes the central cluster contains only its master nodes plus the workers needed to run tenant control planes, with no additional static workers.
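Worked example at the defaults: model 1 has O = 603 ÷ (200·8 + 3) × 100 ≈ 37.6 %, and model 3 has O = 63 ÷ (1,600 + 3 + 60) × 100 ≈ 3.8 %, matching the table in § 03.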

03

Local failures tolerated

⌊(mded + mshared − 1) ÷ 2⌋ for local quorum ; ⌊(mc − 1) ÷ 2⌋ for central

Server failures the site can absorb before its control plane loses write availability.
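For example, three local masters tolerate ⌊(3 − 1) ÷ 2⌋ = 1 failure, while model 4's single local master tolerates ⌊(1 − 1) ÷ 2⌋ = 0.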

04

WAN survival

survives ⇔ local control plane has quorum

Whether the edge site keeps a writable control plane during a WAN partition.

05

Effective CP latency

Leff = Ll (local) ; Ll + Lw (remote) ; Ll + (1 − α)·Lw (hybrid)

Round-trip to a control-plane API server, accounting for an optional local cache.
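At the defaults, the hybrid case (model 4) gives Leff = 0.5 + (1 − 0.70)·20 = 6.5 ms, against 0.5 ms for a fully local control plane and 20.5 ms for a purely remote one.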

06

Blast radius

sites disrupted by central failure : 0 (autonomous) or N (hosted)

How many edge sites lose control-plane writes if the central cluster fails.
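With the default parameters, a central-cluster failure disrupts all N = 200 sites under models 3–5 and none under models 1 and 2.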

07

Max nodes per cluster

S (distributed) ; N·S (gigantic)

Hits Kubernetes' practical scalability ceiling around 5,000 nodes.
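At the defaults, the gigantic cluster joins 200 × 8 = 1,600 nodes, still under the ceiling; with S = 8, it crosses 5,000 nodes once N exceeds 625.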

08

OpEx per year

OpEx = clusters × ops-scale × Cops

Ops-scale factors encode per-cluster operational difficulty: Kamaji-managed clusters are homogeneous and cheap to run (0.5–0.6), bespoke per-site clusters are heavy (1.3–1.5), and the single gigantic cluster is the heaviest of all (3.0).
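Using the defaults, model 1 costs 200 × 1.5 × €10k = €3.0M per year, while the single gigantic cluster costs 1 × 3.0 × €10k = €30k.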

09

Energy per year

E = Meq × Pmaster × 24 h × 365

Energy consumed by control-plane infrastructure alone (excludes worker load).
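Using the defaults, model 1's 603 master-equivalents draw 603 × 200 W × 8,760 h ≈ 1.06 GWh per year; model 3's 63 server-equivalents draw ≈ 110 MWh.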

§ 06 · Limitations

What this model deliberately doesn't capture.

The model assumes uniform edge sites (real deployments rarely are), ignores correlated failures (a single fibre cut can take down many sites at once), does not model storage, and has no Availability Zone concept, which typically applies only to cloud deployments.