How Muzzle Works v1
How Muzzle Works
Muzzle is a transparent inspecting proxy. It takes over the host and port the client already uses, so you configure it once and then use the model provider normally — every request and response flows through Muzzle without any client changes. v1 covers Ollama, OpenAI, and Anthropic.
Request lifecycle
- A client sends a request to a Muzzle proxy listener (the port it used to send to the provider directly).
- The provider adapter parses the provider-native request into Muzzle's canonical model.
- The policy engine inspects the canonical request (the input phase).
- If allowed, Muzzle forwards to the configured upstream and reads the response (buffering streamed responses in full first).
- The adapter parses the response into the canonical model and the engine inspects it (the output phase), including any tool calls.
- Muzzle renders the (possibly redacted or transformed) result back into the provider-native shape and returns it — re-emitting streamed responses chunk by chunk.
Endpoints that are not part of the inspected proxy surface pass through uninspected, so unrelated provider calls keep working.
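The lifecycle above can be sketched as one function. This is a minimal illustration under stated assumptions, not Muzzle's actual code: `handle`, `Verdict`, and the callables passed in are hypothetical names, and the canonical model is reduced to a plain string.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Verdict:
    action: str       # "allow" or "block" in this sketch
    reason: str = ""


def handle(
    raw_request: dict,
    parse_request: Callable[[dict], str],      # provider-native -> canonical (hypothetical)
    inspect: Callable[[str, str], Verdict],    # (phase, text) -> verdict (hypothetical)
    forward: Callable[[dict], List[str]],      # returns the streamed chunks
    render_error: Callable[[str], dict],       # provider-style error body
) -> list:
    """Run one request through input inspection, upstream, and output inspection."""
    canonical = parse_request(raw_request)
    verdict = inspect("input", canonical)
    if verdict.action == "block":
        return [render_error(verdict.reason)]  # upstream is never contacted

    chunks = forward(raw_request)
    full_response = "".join(chunks)            # buffer the whole stream first
    verdict = inspect("output", full_response)
    if verdict.action == "block":
        return [render_error(verdict.reason)]
    return chunks                              # re-emit chunk by chunk, in order
```

Because the output phase sees the joined response, a term split across two chunks is still caught before anything reaches the client.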
Inspection
The engine is rules-first and fast. Detectors run before any optional model call:
- Prompt injection / jailbreak — detects attempts to override instructions.
- Secrets — API keys, tokens, private keys, and similar.
- PII — emails, card numbers, and other personal identifiers (linear scanning to avoid catastrophic backtracking on long input).
- Content policy — a configurable denylist of terms.
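As an illustration of the linear-scanning idea, the sketch below finds card-like digit runs in a single left-to-right pass with no regular expressions, so run time grows linearly with input length. The function names, the 13 to 19 digit length bounds, and the Luhn checksum filter are assumptions made for this example; the source does not say how Muzzle validates candidates.

```python
from typing import List, Tuple

ASCII_DIGITS = "0123456789"


def luhn_ok(digits: str) -> bool:
    """Luhn checksum: double every second digit counting from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str, min_len: int = 13, max_len: int = 19) -> List[Tuple[int, int]]:
    """Return (start, end) spans of card-like digit runs; the index never moves backward."""
    spans = []
    i, n = 0, len(text)
    while i < n:
        if text[i] not in ASCII_DIGITS:
            i += 1
            continue
        start, end, digits = i, i, []
        while i < n:
            ch = text[i]
            if ch in ASCII_DIGITS:
                digits.append(ch)
                end = i + 1
            elif not (ch in " -" and i + 1 < n and text[i + 1] in ASCII_DIGITS):
                break  # neither a digit nor a separator followed by a digit
            i += 1
        if min_len <= len(digits) <= max_len and luhn_ok("".join(digits)):
            spans.append((start, end))
    return spans
```

The returned spans are character offsets, which is the shape an in-place masking step would need.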
Each detector maps to a category, and each category has a configurable action per direction (input and output):
| Action | Effect |
|---|---|
| `allow` | pass through untouched |
| `log` | record the match, take no other action |
| `redact` | mask the matching spans in place |
| `transform` | rewrite the content |
| `block` | reject the request with a provider-style error |
Tool calls are blocked by default.
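The action table can be read as a small dispatch. The sketch below is an assumption-laden illustration: `redact`, `apply_action`, the `[REDACTED]` mask, and the caller-supplied `transform` function are hypothetical, since the source does not specify the mask text or how a transform rewrites content.

```python
from typing import Callable, List, Optional, Tuple

Span = Tuple[int, int]


def redact(text: str, spans: List[Span], mask: str = "[REDACTED]") -> str:
    """Mask each matching span in place, merging overlapping spans first."""
    merged: List[Span] = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    out, cursor = [], 0
    for start, end in merged:
        out.append(text[cursor:start])
        out.append(mask)
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


def apply_action(
    action: str,
    text: str,
    spans: List[Span],
    transform: Optional[Callable[[str], str]] = None,
) -> Optional[str]:
    """Return the text to pass on, or None when the request is blocked."""
    if action in ("allow", "log"):
        return text  # "log" would also record the match in the decision log
    if action == "redact":
        return redact(text, spans)
    if action == "transform":
        if transform is None:
            raise ValueError("transform action needs a rewrite function")
        return transform(text)
    if action == "block":
        return None
    raise ValueError(f"unknown action: {action}")
```

A per-direction lookup would then pick the action first, for example `policies["input"]["secrets"]`, and pass it to `apply_action` along with the detector's spans.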
Muzzle fails closed by default: if inspection cannot complete, the request is
blocked rather than passed through. Set `fail_mode: open` to invert that.
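A minimal sketch of the fail-closed behaviour, assuming inspection is a callable that returns an action string and signals failure by raising. `inspect_or_fail` is a hypothetical name used only for this example.

```python
from typing import Callable


def inspect_or_fail(inspect: Callable[[str], str], text: str, fail_mode: str = "closed") -> str:
    """Return the action from `inspect`, falling back to fail_mode when it errors."""
    if fail_mode not in ("closed", "open"):
        raise ValueError(f"fail_mode must be 'closed' or 'open', got {fail_mode!r}")
    try:
        return inspect(text)
    except Exception:
        # Inspection could not complete: closed blocks, open lets traffic through.
        return "block" if fail_mode == "closed" else "allow"
```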
The LLM judge
An optional LLM judge adds a model-based opinion on top of the rules. It is off
by default and configured under `llm_judge` (`enabled`, `model`, `base_url`).
- When it runs. Only after the hard rule gates (content-policy, secrets, plus the other rule detectors) approve a request. If the rules already block, the judge never runs.
- What it checks. Both input and output, for prompt injection, PII, and banned subjects (see below).
- Direct, unfiltered call. The judge classifies by calling its own model endpoint (`llm_judge.base_url`) directly, which bypasses Muzzle's inspection. The text being judged and the judge's own reply are therefore never re-filtered by the input/output policies, and there is no recursion. Point `base_url` at a real model server, never at Muzzle's own listener.
- Actions. A judge finding maps to the configured action: prompt-injection and PII use the per-direction `policies` actions; each subject uses its own action. `block` rejects the request; `log`/`allow` let it through (judge findings are classifications, not spans, so `redact`/`transform` are recorded but not applied).
- Failure. If the judge model is unreachable, Muzzle honors `fail_mode` (closed → block, open → allow) and logs the outcome.
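The ordering and action mapping described in this list can be sketched as follows. Everything here is illustrative: `judge_step`, the boolean-returning `judge` callable, the use of `ConnectionError` to signal an unreachable model, and the result keys are assumptions, not Muzzle's interface.

```python
from typing import Callable


def judge_step(
    rules_action: str,
    judge: Callable[[str], bool],   # True when the judge flags the text (hypothetical)
    text: str,
    configured_action: str,
    fail_mode: str = "closed",
) -> dict:
    """Decide the outcome of the optional judge stage for one direction."""
    if rules_action == "block":
        # The rules already blocked, so the judge is never called.
        return {"judge_completed": False, "outcome": "block", "recorded": None}
    try:
        flagged = judge(text)
    except ConnectionError:
        outcome = "block" if fail_mode == "closed" else "allow"
        return {"judge_completed": False, "outcome": outcome, "recorded": "judge_unreachable"}
    if not flagged:
        return {"judge_completed": True, "outcome": "allow", "recorded": None}
    # Findings are classifications, not spans: only "block" changes the outcome;
    # log/allow/redact/transform are recorded and the request goes through.
    outcome = "block" if configured_action == "block" else "allow"
    return {"judge_completed": True, "outcome": outcome, "recorded": configured_action}
```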
Subjects (topic policy)
`content_rules.subjects` is a list of banned topics, each with its own action, e.g.
`{ name: "weapons", action: block }`. When the judge is enabled it determines whether
the input or output is about any configured subject and applies that subject's action.
A blocked request returns a provider-style error, e.g. for Ollama:
`{"error": "muzzle rejection: <reason> (log#<ref>)"}`. The `log#<ref>` ties the
rejection to a line in the decision log.
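A small sketch of building that Ollama-style rejection body. The function name is hypothetical, and the reason and ref values in the usage line are made up for illustration.

```python
import json


def ollama_rejection(reason: str, ref: str) -> str:
    """Build the JSON body for a blocked Ollama request."""
    return json.dumps({"error": f"muzzle rejection: {reason} (log#{ref})"})


# Hypothetical reason and ref, shown only to illustrate the shape.
body = ollama_rejection("subject: weapons", "a1b2c3")
```

The same `ref` would appear in the matching decision-log line, which is what lets an operator trace a client-visible error back to the decision.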
Architecture
- Canonical model + provider adapters. Each adapter (Ollama, OpenAI, Anthropic) parses provider-native requests/responses into one canonical request/response/event/tool-call model and renders canonical results back out. The policy engine runs once against the canonical form, so all providers are inspected identically and new providers are additive work.
- Per-listener routing. Muzzle is configured via a YAML file. Each listener binds a `host:port` and has a `kind`: a `proxy` listener routes to one configured upstream; an `admin` listener serves the local admin UI. Multiple upstreams are supported by running multiple proxy listeners.
- Streaming. Streamed responses are buffered, inspected as a whole, then re-emitted preserving chunk ordering, so policy applies to the complete response while the client still sees a stream.
- Decision logging. Every decision is written as a JSON line to the configured `logging.decisions` sink (a file path or `stdout`), with the upstream, stage, action, categories, redaction count, reason, and a short `ref`.
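To make the decision-log format concrete, here is a sketch that writes one JSON line per decision. The key names, the `write_decision` helper, and the 8-character random `ref` are assumptions; the source lists the fields but not their exact spelling or how the ref is generated.

```python
import io
import json
import uuid
from typing import List, Optional, TextIO


def write_decision(
    sink: TextIO,
    upstream: str,
    stage: str,
    action: str,
    categories: List[str],
    redactions: int,
    reason: str,
    ref: Optional[str] = None,
) -> str:
    """Append one decision as a JSON line and return its short ref."""
    record = {
        "upstream": upstream,
        "stage": stage,
        "action": action,
        "categories": categories,
        "redactions": redactions,
        "reason": reason,
        "ref": ref or uuid.uuid4().hex[:8],  # assumed ref format
    }
    sink.write(json.dumps(record) + "\n")
    return record["ref"]


sink = io.StringIO()  # stands in for a file opened in append mode, or sys.stdout
ref = write_decision(sink, "ollama-local", "input", "block", ["secrets"], 0, "api key detected")
```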
Configuration
The YAML config has these sections:
- `listeners`: list of `{ bind, kind, upstream }`. Proxy listeners require an `upstream`; admin listeners ignore it.
- `upstreams`: map of name to `{ provider, base_url }`.
- `policies.default`: `input` and `output` maps of category → action.
- `policies.overrides`: per-upstream policy sets that replace the default for that upstream.
- `fail_mode`: `closed` (default) or `open`.
- `llm_judge`: `{ enabled, model, base_url }`.
- `secrets`: `{ mode, file }`. `mode` is `rules` (built-in regex, default), `file` (exact values from the encrypted file), or `both`.
- `content_rules.denylist_terms` / `subjects`: inline lists (apply to both directions), plus file references below.
- `logging`: `{ level, decisions }`.
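The structural rules in this list can be checked with a few lines of code. The sketch below works on the config after it has been loaded into a Python dict; `validate_config` is a hypothetical helper, not Muzzle's `validate` command, and the bind addresses, upstream name, base URL, and category names in `example` are invented for illustration. The check that a listener's upstream name exists in `upstreams` is also an assumption.

```python
from typing import List


def validate_config(cfg: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    errors: List[str] = []
    upstreams = cfg.get("upstreams", {})
    for i, listener in enumerate(cfg.get("listeners", [])):
        kind = listener.get("kind")
        if kind == "proxy":
            name = listener.get("upstream")
            if not name:
                errors.append(f"listeners[{i}]: proxy listener requires an upstream")
            elif name not in upstreams:
                errors.append(f"listeners[{i}]: unknown upstream {name!r}")
        elif kind != "admin":
            errors.append(f"listeners[{i}]: kind must be 'proxy' or 'admin'")
    if cfg.get("fail_mode", "closed") not in ("closed", "open"):
        errors.append("fail_mode must be 'closed' or 'open'")
    if cfg.get("secrets", {}).get("mode", "rules") not in ("rules", "file", "both"):
        errors.append("secrets.mode must be 'rules', 'file', or 'both'")
    return errors


# All values below are hypothetical examples.
example = {
    "listeners": [
        {"bind": "127.0.0.1:11434", "kind": "proxy", "upstream": "ollama-local"},
        {"bind": "127.0.0.1:8080", "kind": "admin"},
    ],
    "upstreams": {"ollama-local": {"provider": "ollama", "base_url": "http://127.0.0.1:11435"}},
    "policies": {"default": {"input": {"secrets": "block"}, "output": {"pii": "redact"}}},
    "fail_mode": "closed",
    "llm_judge": {"enabled": False},
}
```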
List & secrets files
To keep the YAML small, the bulk lists live in files (one entry per line), with a separate file per direction. The config holds the paths; the entries live in the files.
- `secrets.file`: encrypted file of literal secret values, one per line. It is encrypted with a generated Fernet key at `<file>.key` (chmod 600, owned by the service user). Never hand-edit the ciphertext; use the CLI or admin portal, which decrypt in memory and re-encrypt on save.
- `content_rules.content_policy_files.{input,output}`: plain term files for the content-policy detector, per direction.
- `content_rules.subject_files.{input,output}`: plain subject files, per direction; each line is `name: action` (action defaults to `block`).
The installer pre-creates all of these under `/etc/muzzle/` (empty list files, the secrets key, and an empty encrypted secrets file), wires their paths into the config, and chowns them to the service user, so they exist and are editable from the admin Files tab right after install. The engine also tolerates missing files, treating them as empty.
Edit them with `muzzle secrets|terms|subjects …` or in the admin portal. Files are
loaded when the config reloads, so the CLI restarts the service after an edit.
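As an example of the subject file format (`name: action` per line, with the action defaulting to `block`), a parser could look like the sketch below. `parse_subject_lines` is a hypothetical name, and skipping blank lines is an assumption; the source does not say how blank lines or comments are treated.

```python
from typing import Dict, Iterable, List


def parse_subject_lines(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Parse `name: action` lines; a missing action falls back to "block"."""
    subjects = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # assumed: blank lines are ignored
        name, _, action = line.partition(":")
        subjects.append({"name": name.strip(), "action": action.strip() or "block"})
    return subjects
```

Passing an empty iterable returns an empty list, which matches the behaviour of treating a missing file as empty.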
Operating a v1 install
The installed build runs as a systemd service and exposes an admin listener in the same process, so operators can manage it without leaving the VM.
- Admin UI (on the admin listener): a tabbed page with a Configuration form (General, Upstreams, Listeners, Policies with default + per-upstream overrides, Content rules) where rows can be added and removed; an Advanced YAML tab for raw editing; a Logs tab that live-tails the decision log with filters; and a Simulation tab. Saving validates the config and reloads the running service.
- `muzzle` CLI (on PATH): `validate`, `edit`, `status`, `restart`, `logs` (with `--follow`, `--tail`, and filters), `add` (denylist terms), `input`/`output` (set default policy actions), `upstream list|remove|add` (plus a `muzzle HOST:PORT NAME` shorthand to add one), and `simulate`. With no subcommand, `muzzle` runs the proxy. The config path comes from `--config`/`-c` or `MUZZLE_CONFIG`.
- Install/uninstall: `install.sh` writes `/opt/muzzle/v1`, `/etc/muzzle/muzzle.yaml`, the systemd unit, `/var/log/muzzle/decisions.jsonl`, the `/usr/local/bin/muzzle` wrapper, and a virtualenv with all dependencies. `uninstall.sh` removes that full footprint.
Simulation
- The CLI (`muzzle simulate`) and the admin UI both dry-run policy decisions against the exact same engine the proxy uses in production.
- Simulation accepts provider-native payloads and reports the resulting decision, redactions, and tool-call handling without forwarding anything upstream, so a preview matches what live traffic would do.
Docs workflow
- The living files in `products/muzzle` remain the source of truth.
- An approved export step writes a fresh immutable MongoDB snapshot for each document, stamped with time, source path, commit, and content digest.
- The API serves the latest approved snapshot for each doc.
- The website can browse document history and diffs for each document version.
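The snapshot stamp described above could be assembled as in the sketch below. The field names, the `build_snapshot` helper, and the choice of SHA-256 for the content digest are assumptions; the source does not name the digest algorithm, and writing the record to MongoDB is left out.

```python
import hashlib
from datetime import datetime, timezone


def build_snapshot(source_path: str, content: str, commit: str) -> dict:
    """Assemble one snapshot record; storing it is outside this sketch."""
    return {
        "source_path": source_path,
        "commit": commit,
        "content": content,
        "content_digest": hashlib.sha256(content.encode("utf-8")).hexdigest(),
        "exported_at": datetime.now(timezone.utc).isoformat(),
    }
```

Because the digest depends only on the content, two exports of an unchanged document share a digest even when their commit or timestamp differs.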
More
- Getting started and v1 scope: `products/muzzle/v1/README.md`
- Full design: `docs/plans/2026-06-23-muzzle-v1-design.md`
- Enterprise proxy core design: `docs/plans/2026-06-24-muzzle-enterprise-proxy-core.md`