Muzzle help & setup guide
Muzzle is a transparent inspecting proxy for Ollama, OpenAI, and Anthropic. Point your clients at Muzzle instead of the provider, and it checks every request, response, and tool call against your policy in between. This guide covers install, day-to-day use, configuration, the muzzle CLI, and uninstall.
Overview
Muzzle takes over the host and port your client already uses, so nothing in your app changes. Requests flow through a rules-first engine — prompt injection, secrets, PII, and a content denylist — and each category has a per-direction action: allow, log, redact, transform, or block. Tool calls are blocked by default.
Muzzle fails closed by default: if inspection cannot complete, the request is blocked rather than passed through. An installed build runs as a systemd service and also serves a local admin UI on a second listener.
- Providers: Ollama, OpenAI, and Anthropic, normalized into one policy engine.
- Streaming responses are buffered, inspected, then re-emitted in order.
- Every decision is written to a JSONL decision log you can tail.
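The fail-closed and buffered-streaming behaviour described above can be pictured with a short Python sketch. This is not Muzzle's code: `Blocked`, `inspect_text`, and `relay_stream` are hypothetical names, and the toy denylist check stands in for the real rules engine.

```python
from typing import Callable, Iterable, List


class Blocked(Exception):
    """Raised when a response must not reach the client (hypothetical name)."""


def inspect_text(text: str, denylist: List[str]) -> str:
    """Toy inspector: 'block' if any denylisted term appears, else 'allow'."""
    lowered = text.lower()
    return "block" if any(term.lower() in lowered for term in denylist) else "allow"


def relay_stream(chunks: Iterable[str], inspector: Callable[[str], str]) -> List[str]:
    """Buffer every chunk, inspect the joined text, then re-emit in order."""
    buffered = list(chunks)  # nothing is forwarded until inspection is done
    try:
        verdict = inspector("".join(buffered))
    except Exception as exc:
        # Fail closed: an inspection error is treated as a block.
        raise Blocked(f"inspection failed: {exc}") from exc
    if verdict == "block":
        raise Blocked("content policy")
    return buffered  # the original chunks, in their original order
```

Buffering before inspection is what lets a term split across two chunks still be caught, at the cost of delaying the first byte the client sees.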
Install (Linux + systemd)
Run the installer from the v1 directory on a root-capable Linux host. It installs the source tree, a default config, a systemd unit, and a Python virtualenv with all dependencies, then offers to enable and start the service.
sudo bash install.sh
# the installer then prompts to run:
systemctl daemon-reload
systemctl enable muzzle
systemctl start muzzle
systemctl status muzzle
- Install tree: /opt/muzzle/v1 (with its own .venv).
- Config: /etc/muzzle/muzzle.yaml (a default is written on first install).
- Service: /etc/systemd/system/muzzle.service, running as the muzzle user.
- Decision log: /var/log/muzzle/decisions.jsonl.
- CLI wrapper: /usr/local/bin/muzzle.
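Because the decision log is JSONL (one JSON object per line), it is easy to summarize with a few lines of Python. The sketch below assumes each record carries an `action` field; that field name is a guess based on the `muzzle logs --action` filter, not a documented schema, and `count_actions` is a hypothetical helper.

```python
import json
from collections import Counter
from typing import Iterable


def count_actions(lines: Iterable[str]) -> Counter:
    """Tally decision-log records by their (assumed) 'action' field."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            counts["<unparseable>"] += 1
            continue
        if not isinstance(record, dict):
            counts["<unparseable>"] += 1
            continue
        counts[record.get("action", "<missing>")] += 1
    return counts
```

To run it against an installed host, open /var/log/muzzle/decisions.jsonl and pass the file object to `count_actions`; reading that file may require root or membership in the service's group.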
Use it (transparent setup)
The goal is for clients to keep talking to the same address. Move the real provider to a private port, point Muzzle's upstream at it, and keep Muzzle on the port your clients use. For Ollama that is :11434.
- Move the real Ollama aside so Muzzle can take its port.
- Set the upstream base_url to the real provider and keep the proxy listener on the client-facing port.
- Start Muzzle (the systemd service, or python -m muzzle for a foreground run).
- Use the provider normally — clients hitting the proxy port now flow through Muzzle.
# 1) run the real Ollama on a private port
OLLAMA_HOST=127.0.0.1:11435 ollama serve
# 2) point muzzle's upstream at it, keep the listener on :11434
# (see Configure below), then run muzzle:
MUZZLE_CONFIG=/etc/muzzle/muzzle.yaml muzzle
# 3) clients keep using :11434 — now inspected
ollama run llama3 'hello'
- A blocked request returns a provider-style error, e.g. Ollama: {"error": "muzzle rejection: <reason> (log#<ref>)"}.
- The log#<ref> ties the rejection to a line in the decision log.
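A client that wants to surface the reason and the log reference can pull both out of the Ollama-style error body shown above. The following is a minimal sketch: `parse_rejection` is a hypothetical helper, and it assumes the reference is any run of characters up to the closing parenthesis, since the guide does not specify its format.

```python
import json
import re
from typing import Optional, Tuple

# Matches: muzzle rejection: <reason> (log#<ref>)
_REJECTION = re.compile(r"^muzzle rejection: (?P<reason>.*) \(log#(?P<ref>[^)]+)\)$")


def parse_rejection(body: str) -> Optional[Tuple[str, str]]:
    """Return (reason, ref) for a Muzzle rejection body, else None."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    message = payload.get("error")
    if not isinstance(message, str):
        return None  # e.g. providers whose "error" is an object
    match = _REJECTION.match(message)
    if match is None:
        return None  # an ordinary provider error, not a Muzzle rejection
    return match.group("reason"), match.group("ref")
```

The returned reference can then be searched for in the decision log to find the matching line.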
Configure
Muzzle is driven by a single YAML file (default /etc/muzzle/muzzle.yaml). Listeners bind ports and route to upstreams; policies set per-category actions for input and output; content_rules holds the denylist. Edit it directly, with the muzzle CLI, or in the admin UI form.
listeners:
  - bind: "0.0.0.0:11434"  # client-facing proxy port
    upstream: ollama-local
  - bind: "0.0.0.0:11435"  # local admin UI
    kind: admin
upstreams:
  ollama-local:
    provider: ollama
    base_url: "http://127.0.0.1:11435"
fail_mode: closed  # or: open
policies:
  default:
    input: { prompt_injection: block, secrets: redact, pii: redact }
    output: { content_policy: block, secrets: block, pii: redact }
  overrides: {}  # per-upstream policy sets
content_rules:
  denylist_terms: []
logging:
  level: info
  decisions: /var/log/muzzle/decisions.jsonl
- Admin UI: a second (admin) listener serves a form editor with add/remove rows for upstreams, listeners, policy overrides, and denylist terms, plus an advanced YAML tab, a live decision-log view, and a policy simulator. Saving validates and reloads the running service.
- Per-upstream overrides replace the default policy for that upstream.
- Big lists live in files to keep the YAML small: an encrypted secrets file (secrets.mode = rules | file | both), and per-direction content-policy and subject files. Edit them with the muzzle CLI (below) or the admin portal — never hand-edit the encrypted secrets ciphertext.
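The "overrides replace the default" rule can be sketched in Python as a plain lookup. This is an illustration under assumptions, not Muzzle's resolver: `effective_policy` is a hypothetical name, and the sketch assumes `policies.overrides` is keyed by upstream name with the same `input`/`output` shape as `policies.default`.

```python
from typing import Any, Dict


def effective_policy(config: Dict[str, Any], upstream: str) -> Dict[str, Any]:
    """Pick the policy set for an upstream: its override if present, else default."""
    policies = config.get("policies", {})
    overrides = policies.get("overrides") or {}
    if upstream in overrides:
        # Replace, not merge: nothing is inherited from the default set.
        return overrides[upstream]
    return policies.get("default", {})
```

Under this reading an override must spell out every category it cares about, because anything it omits is not filled in from the default.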
The LLM judge (optional)
The judge adds a model-based opinion on top of the rules. When enabled it inspects both input and output for prompt injection, PII, and banned subjects by calling its own model endpoint directly — that call bypasses Muzzle's inspection, so the text being judged is never re-filtered and there's no recursion. It runs only after the hard rule gates (content-policy, secrets) approve a request.
- Pick a model server for the judge — a real Ollama/OpenAI/Anthropic endpoint, not Muzzle's own port.
- Set llm_judge.enabled: true with a model and base_url (base_url is required when enabled).
- Optionally add banned subjects under content_rules.subjects, each with its own action.
- Save in the admin UI (it reloads the service), or edit the config and run muzzle restart.
- Confirm with muzzle status, then watch decisions with muzzle logs --follow.
llm_judge:
  enabled: true
  model: "llama3"
  base_url: "http://127.0.0.1:11435"  # a real model server, not Muzzle
content_rules:
  subjects:
    - name: "weapons"
      action: block
    - name: "medical advice"
      action: log

muzzle status # shows llm_judge enabled + subjects count
muzzle logs --follow --tail 50 # input-judge / output-judge verdicts
- Actions: prompt-injection and PII use your per-direction policy actions; each subject uses its own. block rejects the request; log/allow let it through.
- If the judge model is unreachable, Muzzle honors fail_mode (closed = block, open = allow).
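The ordering of the gates and the effect of fail_mode can be summarized in a small Python sketch. It is a simplification under assumptions: `decide`, `hard_rules`, and `judge` are hypothetical names, the judge is modelled as a callable that returns an action string, and an unreachable judge is modelled as a `ConnectionError`.

```python
from typing import Callable


def decide(
    text: str,
    hard_rules: Callable[[str], bool],
    judge: Callable[[str], str],
    fail_mode: str = "closed",
) -> str:
    """Return 'allow' or 'block' for one request or response."""
    if not hard_rules(text):
        return "block"  # the judge never sees text the hard gates reject
    try:
        verdict = judge(text)
    except ConnectionError:
        # Judge unreachable: closed = block, open = allow.
        return "block" if fail_mode == "closed" else "allow"
    # Only 'block' rejects; 'log' and 'allow' let the text through.
    return "block" if verdict == "block" else "allow"
```

Running the hard gates first means the judge model is only called for traffic the rules have already approved.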
The muzzle CLI
The installer puts a muzzle command on PATH. It edits the installed config and, where relevant, reloads the service. The config path comes from --config/-c or the MUZZLE_CONFIG environment variable. With no subcommand, muzzle runs the proxy.
muzzle validate # check the config is valid
muzzle status # summary of listeners, upstreams, policy
muzzle edit # open the config in $EDITOR

muzzle upstream list
muzzle upstream add https://api.openai.com/v1 openai-main
muzzle upstream remove openai-main
# shorthand to add an Ollama upstream by host:port and name:
muzzle 10.10.11.251:11434 ollama-remote

muzzle input --prompt-injection block --secrets redact --pii redact
muzzle output --content-policy block --secrets block --pii redact

muzzle add --input-content "glorious-day"
muzzle add --output-content "internal-codename"

# encrypted secrets file (mode: rules | file | both)
muzzle secrets mode both
muzzle secrets add "sk-my-real-key-value"
muzzle secrets list
# per-direction content-policy term files
muzzle terms add "glorious-day" --input
muzzle terms list --output
# per-direction subject files (action defaults to block)
muzzle subjects add weapons --action block --input
muzzle subjects add "medical advice" --action log --output

muzzle logs --follow --tail 50
muzzle logs --action block --upstream ollama-local
muzzle logs --stage output --contains secrets

muzzle simulate --upstream ollama-local --phase output --endpoint chat --file response.json
# or pipe a payload on stdin:
echo '{"messages":[{"role":"user","content":"hi"}]}' | \
  muzzle simulate --upstream ollama-local --phase input

muzzle restart
Uninstall
The uninstaller removes the full local footprint — the install tree, config, logs, CLI wrapper, and service artifacts.
sudo bash uninstall.sh
Full living documents
This guide is curated. The canonical README, HOWITWORKS, plain-English, and help-desk documents (with full version history and diffs) live in the product's living documents.