Observability
Cosmonic Control ships a fully integrated observability stack. No external dependencies are required.
Architecture
All components export telemetry via OTLP to the opentelemetry-collector, which fans out to:
- Prometheus — metrics
- Loki — logs
- Tempo — distributed traces
- Perses — unified dashboard over all three backends
Workload telemetry
HostGroups emit OpenTelemetry data for Wasm workloads when the --wasi-otel flag is set on the control-host container. The flag is enabled by default via the opentelemetry.workload: true value on the cosmonic-control-hostgroup chart.
With the flag enabled, workloads on the HostGroup can import the wasi:otel WIT package to emit traces, metrics, and logs:
import wasi:otel/types@0.2.0-rc.1;
import wasi:otel/tracing@0.2.0-rc.1;
import wasi:otel/logs@0.2.0-rc.1;
import wasi:otel/metrics@0.2.0-rc.1;
Emitted data flows to the endpoint in opentelemetry.endpoint (default http://opentelemetry-collector:4317), which is the same OTel collector the control plane uses. Traces, metrics, and logs land in Tempo, Prometheus, and Loki respectively, and appear in the Perses dashboards alongside control-plane telemetry.
To opt out per HostGroup, set opentelemetry.workload: false:
# hostgroup-values.yaml
opentelemetry:
  workload: false
To send workload telemetry to a different collector, override opentelemetry.endpoint:
# hostgroup-values.yaml
opentelemetry:
  endpoint: https://my-collector.corp.com:4317
  insecure: false
See the wasi:otel package spec and the otel-http example for reference usage.
Accessing the Perses dashboard
Perses is deployed as a ClusterIP service and is not exposed externally by default. Use kubectl port-forward to access it locally:
kubectl port-forward svc/perses 8080:8080 -n cosmonic-system
Open http://localhost:8080 in your browser.
To expose Perses externally (for example, behind an ingress controller), change the service type in your values file:
perses:
  service:
    type: LoadBalancer # or NodePort, or configure your own ingress
Built-in dashboards
Cosmonic Control provisions the following Perses dashboards automatically:
Workload Activity
Namespace, workload, and host variables drive the entire dashboard. It shows per-host RPS, an error-rate stat with an idle empty state, a sorted per-host table, and separate collapsible HTTP, Blobstore, Keyvalue, Messaging, and Logs rows. Each TraceTable links into the Tempo Explorer with the dashboard variables pre-populated.
Host Activity
Shows per-host rollups of every workload running on the selected host (count, RPS, error rate, sorted table), host-process span rates (connect_nats, workload lifecycle, component prep, plugin bind/unbind), and a host-scoped logs panel filtered by k8s_pod_name.
Host Infrastructure
- Host Reconciliation Activity
- Host Controller Errors
- Workqueue Depth by Controller
Workloads
- Workload Reconciliation Rate
- Workload Errors by Type
- Active Workers by Controller
Operator Resource Usage
- Memory Usage
- CPU Usage
- Goroutines
Host identity on telemetry
Every span, log, and metric emitted by a HostGroup pod carries the following OpenTelemetry resource attributes, set on the host via the Kubernetes downward API:
- k8s.pod.name
- k8s.pod.uid
- k8s.node.name
- k8s.namespace.name
- cosmonic.io/hostgroup
Use these to scope queries to a specific pod, node, or HostGroup without joining against external cluster state. The Host Activity dashboard uses k8s_pod_name (the Loki structured-metadata field copied from k8s.pod.name) as its host selector.
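As an illustration of scoping a query by one of these attributes, the sketch below builds a TraceQL selector on the k8s.pod.name resource attribute and sends it to Tempo's search API. It assumes Tempo has been port-forwarded to localhost:3200; the TEMPO_URL variable, the function names, and the pod name in the usage comment are hypothetical and not part of Cosmonic Control.

```shell
#!/bin/sh
# Sketch only: assumes Tempo is reachable at $TEMPO_URL, for example via
#   kubectl port-forward svc/tempo 3200:3200 -n cosmonic-system
TEMPO_URL="${TEMPO_URL:-http://localhost:3200}"

# traceql_for_pod prints a TraceQL selector that matches spans whose
# resource attribute k8s.pod.name equals the given pod name.
traceql_for_pod() {
  printf '{ resource.k8s.pod.name = "%s" }\n' "$1"
}

# tempo_search_pod asks Tempo's search API for traces emitted by that pod
# and prints the raw JSON response.
tempo_search_pod() {
  curl -sG "${TEMPO_URL%/}/api/search" \
    --data-urlencode "q=$(traceql_for_pod "$1")"
}

# Usage (hypothetical pod name, requires the port-forward above):
#   tempo_search_pod my-hostgroup-pod-0
```

The same pattern should carry over to the other attributes, for example k8s.node.name, by changing the attribute in the selector.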
Accessing backends directly
Each backend is available as a ClusterIP service in the cosmonic-system namespace for direct access or integration with external tooling (e.g. an existing Grafana instance):
| Service | Port | Protocol |
|---|---|---|
| prometheus | 9090 | HTTP |
| loki | 3100 | HTTP (Loki API) |
| tempo | 3200 (HTTP) / 4317 (gRPC) | HTTP / OTLP |
| opentelemetry-collector | 4317 (gRPC) / 4318 (HTTP) | OTLP |
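For a quick check that the backends respond before wiring up external tooling, something like the following sketch may help. It assumes the prometheus and loki services have been port-forwarded to their listed ports on localhost; the PROM_URL and LOKI_URL variables and the helper names are illustrative, and the PromQL expression in the usage comment is only a placeholder that may return an empty result.

```shell
#!/bin/sh
# Sketch only: assumes port-forwards such as
#   kubectl port-forward svc/prometheus 9090:9090 -n cosmonic-system
#   kubectl port-forward svc/loki 3100:3100 -n cosmonic-system
PROM_URL="${PROM_URL:-http://localhost:9090}"
LOKI_URL="${LOKI_URL:-http://localhost:3100}"

# api_url joins a base URL and a path, tolerating one trailing slash
# on the base.
api_url() {
  printf '%s%s\n' "${1%/}" "$2"
}

# prom_query runs a PromQL instant query through the Prometheus HTTP API
# and prints the raw JSON response.
prom_query() {
  curl -sG "$(api_url "$PROM_URL" /api/v1/query)" \
    --data-urlencode "query=$1"
}

# loki_labels lists the label names Loki currently knows about, which is
# a useful first step before writing LogQL selectors.
loki_labels() {
  curl -s "$(api_url "$LOKI_URL" /loki/api/v1/labels)"
}

# Usage (requires the port-forwards above):
#   prom_query 'up'
#   loki_labels
```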
To disable the built-in Perses dashboard (for example, when integrating with an existing Grafana deployment):
perses:
  uiEnabled: false
Custom dashboards
Perses supports a Dashboard-as-Code approach via provisioning. Add custom dashboards with the perses.provisioning.extraProvisioningFiles Helm value. See the Perses documentation for the dashboard file format.