Skip to main content

Security and Non-Deterministic I/O

Traditional access-control models assume the operator can predict, at policy-design time, which resources a workload will ask for. ACLs name principals and resources; RBAC names roles and permissions. The assumption holds when application code is written by people who know what it needs to touch.

The assumption breaks when an LLM is in the loop. The model's tool calls, query patterns, and downstream actions are functions of a runtime input (a prompt) that may have been written by an attacker. The operator can no longer predict which resources the workload will ask for, because the access pattern is a non-deterministic function of an untrusted input.

This is where capability-based security becomes essential. Even when an attacker fully compromises the model's decisions, the workload can only reach capabilities the operator granted it through the manifest. That bound is static, inspectable, and unaffected by what the model decides.

Why traditional models break

When application code is written by humans, permission scope is bounded by what the code requires at design time.

When the same service has an LLM inside it generating tool calls, the access pattern is no longer derivable from the code. It is a function of:

  • The model's training distribution, which is non-deterministic given any one prompt.
  • The user's prompt, which is often untrusted.
  • Tool descriptions and intermediate outputs that feed back into the model's context window.

This is the structural reason prompt injection (Willison, 2022) and indirect prompt injection (Greshake et al., 2023) are not bugs that can be patched away. They are consequences of mixing instructions and data in a single context window. Filtering prompts and sanitizing outputs raises the bar, but does not prove a bound. The defense has to live somewhere the attacker can't reach.

Why static bounds matter

Capability-based security gives you a bound that lives outside the model's reach.

A WebAssembly component holds only the capabilities its manifest grants. The model's output can drive which of those capabilities the component invokes and with what arguments, but it cannot grow the set. If the model says "exfiltrate to attacker.com" and attacker.com is not in the allowedHosts allowlist, the request fails at the host plugin layer, not at "we hope the model doesn't try."

Two-panel diagram comparing AI workloads under ambient authority versus capability bounds. In both panels, the same chain runs top to bottom: attacker prompt feeds into an LLM, which feeds into the workload. Left panel "Ambient authority + LLM": the workload is a process with ambient authority, and solid arrows reach three targets (api.anthropic.com, attacker.com, filesystem). Right panel "Capability bounds + LLM": the workload is a Wasm component with manifest-granted capabilities, a single solid arrow reaches the granted target (api.anthropic.com), and dashed lines mark attacker.com and filesystem as unreachable. The structural defense is the manifest, which the model cannot widen.

This is the same invariant that made the Principle of Least Authority useful in 1975, applied to a new failure mode. The set of things a workload can ultimately cause to happen is fixed when the manifest is written, not when the model is called. Even a fully compromised model is bounded by the capabilities the platform team granted at deploy time.

Bounds in Cosmonic Control

A typical AI workload in Cosmonic Control might be an agent or MCP server running as a Wasm component inside a WorkloadDeployment. The manifest declares everything the workload may reach:

apiVersion: runtime.wasmcloud.dev/v1alpha1
kind: WorkloadDeployment
metadata:
  name: customer-support-agent
spec:
  replicas: 3
  template:
    spec:
      hostSelector:
        hostgroup: default
      components:
        - name: agent
          image: ghcr.io/example/support-agent:0.3.0
          localResources:
            environment:
              config:
                LOG_LEVEL: info
            allowedHosts:
              - https://api.anthropic.com
              - https://kb.support.internal:8443
      hostInterfaces:
        - namespace: wasi
          package: keyvalue
          version: 0.2.0-draft
          interfaces:
            - store
          config:
            bucket: support-conversations

The agent component can:

  • Call the Anthropic API (api.anthropic.com), needed for the model.
  • Call the internal knowledge base (kb.support.internal:8443).
  • Read and write conversation state in the support-conversations keyvalue bucket.
  • Read its LOG_LEVEL config value.

Whatever a customer's prompt convinces the model to ask for, the agent cannot:

  • Make a request to attacker.com, an internal admin API, or any host not in allowedHosts.
  • Read another tenant's conversations or any other keyvalue bucket.
  • Read environment variables outside its declared config.
  • Touch the filesystem, spawn a process, or open arbitrary TCP connections.

The attack surface is exactly the manifest. A reviewer reading the YAML can see, statically, the upper bound of what this agent can do — without trusting the component image, the model, or any chain of prompts that reaches the workload at runtime.

See Sandbox AI for the applied tutorial: deploying an MCP server as a capability-bound Wasm component on Cosmonic Control.