
Autoscaling

As of Cosmonic Control v0.6.0, both HTTPTrigger and WorkloadDeployment expose the Kubernetes /scale subresource, so a stock HorizontalPodAutoscaler (HPA) or KEDA ScaledObject can target either resource directly.

Jeremy Fleitz walks through the design and a working demo of WorkloadDeployment autoscaling on the wasmCloud community call.

Targeting an HTTPTrigger

HTTPTrigger.status.currentReplicas and HTTPTrigger.status.selector are mirrored from the backing WorkloadDeployment on every reconciliation, so HPA reads the live replica count from the trigger itself and never needs to target the underlying WorkloadDeployment. HPA writes the desired count to the trigger's spec.replicas, which the operator propagates to the workload on the next reconciliation.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hello-world
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: control.cosmonic.io/v1alpha1
    kind: HTTPTrigger
    name: hello-world
  minReplicas: 1
  maxReplicas: 10
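  metrics:
    # External metrics are served by a metrics adapter (for example, the Prometheus Adapter).
    - type: External
      external:
        metric:
          name: http_requests_per_second
        target:
          # Aim for an average of 10 requests per second per replica.
          type: AverageValue
          averageValue: '10'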

WorkloadDeployment exposes the same subresource shape, so the manifest above also works for non-HTTP workloads once scaleTargetRef points at the WorkloadDeployment (its kind and name and, where it differs, its apiVersion).
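As a sketch of the WorkloadDeployment case, the HPA below targets a WorkloadDeployment instead of a trigger. The resource name background-worker and the metric name queue_depth are hypothetical, and the runtime.wasmcloud.dev/v1alpha1 apiVersion is an assumption based on the upstream wasmCloud operator's API group; kubectl api-resources shows the group and version your cluster actually serves.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: background-worker        # hypothetical name
  namespace: default
spec:
  scaleTargetRef:
    # Assumption: API group/version of the upstream wasmCloud operator.
    apiVersion: runtime.wasmcloud.dev/v1alpha1
    kind: WorkloadDeployment
    name: background-worker      # hypothetical name
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: External
      external:
        metric:
          name: queue_depth      # hypothetical metric served by an external metrics adapter
        target:
          type: AverageValue
          averageValue: '5'
```

Only the scaleTargetRef block and the choice of metric differ from the HTTPTrigger manifest.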

Don't set spec.replicas in the manifest of a trigger that an HPA already manages. Pick one source of truth so that the HPA and whatever applies your manifests don't fight over the field.

Mechanics, metric sources, and KEDA

The /scale subresource on WorkloadDeployment (and now HTTPTrigger) is the same one that backs kubectl scale, HPA, and KEDA. The upstream wasmCloud Autoscaling guide covers the mechanics in full, including:

  • Why HPA's built-in Resource (CPU/memory) and Pods metric types don't apply to Wasm workloads, and how to use External or Object metrics via the Prometheus Adapter instead.
  • The host group as a precondition: HPA holds at host-group capacity, not at maxReplicas.
  • A worked KEDA ScaledObject example that scales on a Prometheus query of metrics emitted by the workload itself.
  • The runtime.wasmcloud.dev/workload-deployment selector label.
  • Scale-to-zero, cold-start, and other cases where autoscaling is not the right answer.
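For orientation, a CustomResourceDefinition opts into /scale by declaring where its replica fields live. The fragment below is a sketch of such a declaration using the field paths named on this page (spec.replicas, status.currentReplicas, status.selector). It is not copied from the Cosmonic or wasmCloud CRDs, whose published manifests may differ.

```yaml
# Fragment of a CustomResourceDefinition version entry (apiextensions.k8s.io/v1).
# Illustrative only; not the published HTTPTrigger or WorkloadDeployment CRD.
versions:
  - name: v1alpha1
    served: true
    storage: true
    subresources:
      status: {}
      scale:
        # Field that kubectl scale, HPA, and KEDA write the desired count to.
        specReplicasPath: .spec.replicas
        # Field the autoscaler reads the observed replica count from.
        statusReplicasPath: .status.currentReplicas
        # Serialized label selector; HPA requires this path to be set.
        labelSelectorPath: .status.selector
```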

Everything in the upstream guide applies to HTTPTrigger as well. The only change is to use the scaleTargetRef block from the example above when the trigger is the scale target.
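A minimal sketch of the same scaleTargetRef used in a KEDA ScaledObject, assuming a Prometheus server at the hypothetical in-cluster address below and a hypothetical http_requests_total counter labeled by trigger name. Only the scaleTargetRef block comes from this page; the query, threshold, and replica bounds are placeholders to adapt.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: hello-world
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: control.cosmonic.io/v1alpha1
    kind: HTTPTrigger
    name: hello-world
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: prometheus
      metadata:
        # Hypothetical Prometheus address; replace with your own.
        serverAddress: http://prometheus.monitoring.svc:9090
        # Hypothetical query and metric name.
        query: sum(rate(http_requests_total{trigger="hello-world"}[1m]))
        # Target value per replica (KEDA's default AverageValue metric type).
        threshold: '10'
```

KEDA creates and manages an HPA on your behalf, so don't define a separate HorizontalPodAutoscaler for the same trigger.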