HIP-0300: Unified MCP Tools Architecture
Status: Draft
Author: Hanzo AI
Created: January 2025
Updated: January 2025
Abstract
HIP-0300 defines a unified architecture for Model Context Protocol (MCP) tools, consolidating 52+ individual tools into ~16 orthogonal, composable operators. The design follows Unix philosophy: each tool does one thing well, with clear composition laws.
Motivation
The original hanzo-mcp implementation grew organically to 52+ tools with significant overlap:
- Multiple search tools (grep, search, find, ast_search)
- Multiple file tools (read, write, edit, cat, head, tail)
- Multiple shell tools (bash, zsh, sh, shell, cmd)
This overlap makes tool selection ambiguous for LLM agents and enlarges the tool list they must reason over. HIP-0300 restructures tools around orthogonal axes with a formal effect lattice.
Design Philosophy
Core Principles
- Orthogonal Axes: Tools organized along independent dimensions
- Composability: Small operators that combine predictably
- Effect Tracking: Every operation has a declared effect level
- Transform vs Apply: Pure transforms produce Patches; Apply commits them
- Minimal Surface Area: ~16 operators (8 core, 2 control surfaces, 6 extended) replace the 52+ original tools
The Three Lattices
Three lattices constrain all operations:
EFFECT LATTICE (purity)
├─ PURE (no side effects, referentially transparent)
├─ DETERMINISTIC_EFFECT (effects, but reproducible)
└─ NONDETERMINISTIC_EFFECT (network, time, randomness)
REPRESENTATION LATTICE (data granularity)
├─ Bytes (raw content)
├─ Lines (text with line numbers)
├─ AST (parsed structure)
├─ Patch (diff/change set)
└─ Symbols (semantic references)
SCOPE LATTICE (operational boundary)
├─ Span (byte/char range)
├─ Region (line range)
├─ File (single file)
├─ Tree (directory subtree)
└─ Repo (entire repository)
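One way to read the effect lattice is as a total order in which a composed chain takes the least upper bound of its steps. The sketch below assumes that ordering (PURE < DETERMINISTIC < NONDETERMINISTIC) and that join rule; `join` and `chain_effect` are hypothetical helper names, not part of the HIP.

```python
from enum import IntEnum
from functools import reduce


class Effect(IntEnum):
    """Effect levels, ordered from least to most effectful (assumed ordering)."""
    PURE = 0
    DETERMINISTIC = 1
    NONDETERMINISTIC = 2


def join(a: Effect, b: Effect) -> Effect:
    """Least upper bound; on a total order this is simply the maximum."""
    return max(a, b)


def chain_effect(steps: list) -> Effect:
    """Effect of a composed chain: the join of all step effects (PURE if empty)."""
    return reduce(join, steps, Effect.PURE)


# Hypothetical chains: a pure preview, and a preview followed by a disk write
# and a test run.
preview_chain = [Effect.PURE, Effect.PURE]
rename_chain = [Effect.PURE, Effect.DETERMINISTIC, Effect.NONDETERMINISTIC]

preview_effect = chain_effect(preview_chain)
rename_effect = chain_effect(rename_chain)
```

Under this reading, a chain stays preview-safe only while every step is PURE; a single effectful step raises the level of the whole chain.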
Operator Surface
Core Operators (HIP-0300)
| Tool | Axis | Actions | Effect |
|---|---|---|---|
| fs | Bytes + Paths | read, write, edit, search, list, stat, patch, tree, glob | DETERMINISTIC |
| id | Identity | hash, uri, ref, verify | PURE |
| code | Symbols + Structure | parse, serialize, symbols, definition, references, transform, summarize | PURE/DETERMINISTIC |
| proc | Execution | run, bg, signal, status, wait | NONDETERMINISTIC |
| vcs | History + Diffs | status, diff, commit, log, branch, stash | DETERMINISTIC |
| test | Validation | check, build, test, detect | NONDETERMINISTIC |
| net | Network | search, fetch, download, crawl, head | NONDETERMINISTIC |
| plan | Orchestration | intent, route, compose | PURE |
Control Surfaces
| Tool | Surface | Actions | Effect |
|---|---|---|---|
| browser | Web DOM | navigate, click, type, screenshot, evaluate, etc. | NONDETERMINISTIC |
| computer | OS Desktop | screenshot, click, type, key, scroll, etc. | NONDETERMINISTIC |
Extended Operators
| Tool | Domain | Actions |
|---|---|---|
| lsp | Semantic Stream | diagnostics, code_actions, hover, completion |
| memory | Knowledge Persistence | read, write, search, create, recall |
| todo | Task Tracking | list, add, update, remove |
| reasoning | Cognition | think, critic |
| agent | Multi-Agent | run, list, status, config |
| llm | LLM Interface | chat, consensus |
Verb Kernel
The verb kernel gives each core-operator verb a typed signature (the Core Operators table lists 43 verbs across the eight operators):
File Operations (fs)
read : Path → Bytes | Text | Lines
write : (Path, Content) → {ok, hash}
edit : (Path, Patch) → {ok, hash}
search : (Pattern, Scope) → [Match]
list : Path → [Entry]
stat : Path → {size, mtime, mode, hash}
patch : (Path, Patch, base_hash?) → {ok, new_hash}
tree : (Path, depth?) → TreeNode
glob : (Pattern, Path?) → [Path]
Identity Operations (id)
hash : Content → {digest, algo, size}
uri : Path → {uri, path, exists}
ref : (Path, line?, col?) → {uri, range?, hash?}
verify : (Content, Digest) → {match, actual, expected}
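A minimal sketch of `hash` and `verify` built on Python's `hashlib`, assuming digests are written as `algo:hex` (the `sha256:...` form used elsewhere in this document). The function names `hash_content` and `verify_content` are illustrative, not the package's API.

```python
import hashlib


def hash_content(content: bytes, algo: str = "sha256") -> dict:
    """Content → {digest, algo, size}."""
    hex_digest = hashlib.new(algo, content).hexdigest()
    return {"digest": f"{algo}:{hex_digest}", "algo": algo, "size": len(content)}


def verify_content(content: bytes, expected: str) -> dict:
    """(Content, Digest) → {match, actual, expected}."""
    # The algorithm is read from the prefix of the expected digest.
    algo = expected.split(":", 1)[0]
    actual = hash_content(content, algo)["digest"]
    return {"match": actual == expected, "actual": actual, "expected": expected}


info = hash_content(b"hello")
good = verify_content(b"hello", info["digest"])
bad = verify_content(b"hellO", info["digest"])
```

Both functions depend only on their arguments, which is consistent with the PURE effect level the operator table assigns to `id`.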
Code Operations (code)
parse : (Path | Text, lang?) → AST
serialize : AST → Text
symbols : (Path | AST, kind?) → [Symbol]
definition : (Path, Position) → [Location]
references : (Path, Position) → [Location]
transform : (Path | Text, kind, params) → Patch # PURE!
summarize : (Diff | Log | Report) → {summary, risks, next_actions}
Process Operations (proc)
run : Command → {stdout, stderr, code}
bg : Command → ProcessID
signal : (ProcessID, Signal) → {ok}
status : ProcessID? → [ProcessStatus]
wait : (ProcessID, timeout?) → {stdout, stderr, code}
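As an illustration of the `run` signature, the sketch below wraps `subprocess.run` and returns the `{stdout, stderr, code}` shape. It assumes the command is given as an argument list and executed without a shell; the actual ProcTool may differ.

```python
import subprocess
import sys
from typing import Optional


def run(command: list, timeout: Optional[float] = None) -> dict:
    """Command → {stdout, stderr, code}, executed without a shell."""
    completed = subprocess.run(
        command,
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode output to str
        timeout=timeout,      # raises subprocess.TimeoutExpired when exceeded
        shell=False,          # argv list is passed as-is, no shell parsing
    )
    return {
        "stdout": completed.stdout,
        "stderr": completed.stderr,
        "code": completed.returncode,
    }


# Use the current interpreter so the example does not depend on other binaries.
hello = run([sys.executable, "-c", "print('hello')"])
failing = run([sys.executable, "-c", "import sys; sys.exit(3)"])
```

Passing an argument list with `shell=False` keeps user-supplied text from being interpreted as shell syntax, which is why a shell is treated as an explicit opt-in.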
Version Control (vcs)
status : () → {staged, unstaged, untracked}
diff : (ref1?, ref2?, paths?) → Diff
commit : (message, files?) → {sha, message}
log : (n?, since?, until?, path?) → [Commit]
branch : (name?, action?) → {current, branches}
stash : (action, message?) → {ok}
Validation (test)
check : (Path?, tool?) → {diagnostics, pass} # Lint/typecheck
build : (Path?, tool?) → {success, artifacts} # Compilation
test : (selector?, tool?) → {passed, failed, summary} # Runtime
detect : (Path?) → {test_runner, build_tool, check_tool}
Network (net)
search : (Query, engine?) → [{url, title, snippet}]
fetch : (URL, extract_text?) → {text, mime, status, hash}
download : (URL, dest?, assets?) → {path, size, mime}
crawl : (URL, dest, depth?, limit?) → {pages, count}
head : URL → {status, headers, size?, mime?}
Orchestration (plan)
Effect Annotations
Every action declares its effect level:
from enum import Enum

class Effect(Enum):
    PURE = "pure"                          # No side effects
    DETERMINISTIC = "deterministic"        # Effects, reproducible
    NONDETERMINISTIC = "nondeterministic"  # Network/time/random

# Examples:
# fs.read        → DETERMINISTIC_EFFECT (reads disk)
# code.parse     → PURE (in-memory transform)
# code.transform → PURE (produces Patch value)
# fs.patch       → DETERMINISTIC_EFFECT (writes disk)
# net.fetch      → NONDETERMINISTIC_EFFECT (network I/O)
Transform vs Apply
Critical separation of concerns:
Transform (PURE)         Apply (EFFECT)
─────────────────        ────────────────
code.transform           fs.patch
  → Patch value            → committed change
  → base_hash              → requires base_hash precondition
  → preview-safe           → point of no return
vcs.diff                 vcs.commit
  → Diff value             → committed history
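To make the transform side concrete, here is a minimal sketch of a pure rename that returns a Patch value without touching disk. It uses a naive whole-word textual rename with `difflib` as a stand-in for the AST-aware transform the HIP describes; `rename_transform` and `sha256_of` are hypothetical names.

```python
import difflib
import hashlib
import re


def sha256_of(text: str) -> str:
    return "sha256:" + hashlib.sha256(text.encode("utf-8")).hexdigest()


def rename_transform(text: str, old_name: str, new_name: str) -> dict:
    """Pure: text in, Patch value out. No file is read or written."""
    # Whole-word replacement only; a real transform would work on the AST.
    pattern = rf"\b{re.escape(old_name)}\b"
    new_text = re.sub(pattern, lambda _match: new_name, text)
    patch = list(difflib.unified_diff(
        text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile="before",
        tofile="after",
    ))
    return {
        "patch": patch,
        "base_hash": sha256_of(text),      # what the patch was computed against
        "new_hash": sha256_of(new_text),   # what applying it should produce
    }


source = "def foo():\n    return foo_bar\n"
value = rename_transform(source, "foo", "bar")
again = rename_transform(source, "foo", "bar")
```

Because the result depends only on the inputs, the same call can be repeated for previews or reviews at no risk; only the later apply step changes state.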
Content-Addressable Edits
All mutations are guarded by content hashes: a patch records the hash of the content it was computed against, and applying it fails if the file no longer matches.
# Generate a transform
result = await code.call(ctx, action="transform",
                         path="/src/main.py",
                         kind="rename",
                         old_name="foo",
                         new_name="bar")
# Returns: {patch: [...], base_hash: "sha256:abc...", new_hash: "sha256:def..."}
# Apply with precondition
applied = await fs.call(ctx, action="patch",
                        path="/src/main.py",
                        patch=result["patch"],
                        base_hash=result["base_hash"])  # Must match current
# Fails if file changed since transform was computed
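The precondition check itself can be sketched in a few lines. This is an assumption about how such a guard might look, not the FsTool implementation: `apply_if_unchanged` and `StaleBaseError` are hypothetical names, and the new content is written whole rather than applied as a diff.

```python
import hashlib
import tempfile
from pathlib import Path


class StaleBaseError(Exception):
    """The file no longer matches the hash the change was computed against."""


def file_hash(path: Path) -> str:
    return "sha256:" + hashlib.sha256(path.read_bytes()).hexdigest()


def apply_if_unchanged(path: Path, new_content: bytes, base_hash: str) -> dict:
    current = file_hash(path)
    if current != base_hash:
        raise StaleBaseError(f"{path}: expected {base_hash}, found {current}")
    path.write_bytes(new_content)
    return {"ok": True, "new_hash": file_hash(path)}


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "main.py"
        target.write_bytes(b"def foo(): pass\n")
        base = file_hash(target)
        first = apply_if_unchanged(target, b"def bar(): pass\n", base)
        try:
            # Reusing the old base hash must fail: the file has changed.
            apply_if_unchanged(target, b"def baz(): pass\n", base)
            rejected = False
        except StaleBaseError:
            rejected = True
        return first["ok"], rejected, target.read_bytes()


first_ok, second_rejected, final_content = demo()
```

Note that the check and the write are separate steps here, so a concurrent writer could still slip in between them; closing that window would need a lock or an atomic replace, which this sketch leaves out.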
Intent Routing
Natural language maps to canonical operator chains:
Intent IR Structure
class IntentIR:
    category: str      # navigate, explain, modify, validate, debug, create
    action: str        # find, understand, rename, test, trace, add
    target: str        # extracted entity/path/concept
    confidence: float
Canonical Chains
| Intent Pattern | Canonical Chain |
|---|---|
| "find X" | fs.search(X) |
| "what is X" | fs.search(X) → code.summarize |
| "rename X to Y" | code.references(X) → code.transform(rename) → code.summarize → [policy] → fs.patch → test.test |
| "fix bug in X" | fs.read(X) → code.parse → code.transform(fix) → [policy] → fs.patch → test.test |
| "add tests for X" | code.symbols(X) → code.transform(add_tests) → fs.write → test.test |
| "why does X fail" | test.test(X) → vcs.log → code.summarize |
| "refactor X" | code.references(X) → code.transform → code.summarize → [policy] → fs.patch → test.test |
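A table like this can be approximated with a small rule list. The sketch below is a toy pattern matcher, not the PlanTool router: the regular expressions, the `Route` type, and the `route` function are assumptions, and the rename chain stops at `fs.patch` (the final validation step is omitted for brevity).

```python
import re
from dataclasses import dataclass
from typing import Optional

POLICY_GATE = "[policy]"


@dataclass
class Route:
    action: str
    target: str
    chain: list


# (pattern, action, chain). Chains follow the table above.
RULES = [
    (re.compile(r"^find (?P<target>.+)$", re.I), "find",
     ["fs.search"]),
    (re.compile(r"^what is (?P<target>.+)$", re.I), "understand",
     ["fs.search", "code.summarize"]),
    (re.compile(r"^rename (?P<target>\S+) to \S+$", re.I), "rename",
     ["code.references", "code.transform", "code.summarize",
      POLICY_GATE, "fs.patch"]),
]


def route(text: str) -> Optional[Route]:
    """Return the first matching canonical chain, or None if nothing matches."""
    for pattern, action, chain in RULES:
        match = pattern.match(text.strip())
        if match:
            return Route(action, match.group("target"), chain)
    return None


found = route("find login handler")
renamed = route("rename foo to bar")
unknown = route("hello there")
```

Keeping the policy gate as an explicit element of the chain means an executor can stop at that position and ask for approval before any effectful step runs.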
Policy Gates
High-risk operations require explicit approval:
- fs.patch (file modifications)
- fs.write (new file creation)
- proc.run with shell=True
- vcs.commit / vcs.push
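The gate list above could be checked with a simple predicate. This is a sketch under the assumption that a call is identified by a (tool, action) pair plus its parameters; `requires_approval` is a hypothetical helper.

```python
from typing import Optional

# (tool, action) pairs that always need approval, taken from the list above.
GATED_ACTIONS = {
    ("fs", "patch"),
    ("fs", "write"),
    ("vcs", "commit"),
    ("vcs", "push"),
}


def requires_approval(tool: str, action: str, params: Optional[dict] = None) -> bool:
    """True when the call is on the gate list or runs a process through a shell."""
    params = params or {}
    if (tool, action) in GATED_ACTIONS:
        return True
    return tool == "proc" and action == "run" and params.get("shell") is True


read_gated = requires_approval("fs", "read")
patch_gated = requires_approval("fs", "patch")
plain_run_gated = requires_approval("proc", "run", {"shell": False})
shell_run_gated = requires_approval("proc", "run", {"shell": True})
```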
Validation Loops
Three distinct validation operations (Vim-inspired):
CHECK (:make equivalent)
- Fast, incremental feedback
- Lint, typecheck, format check
- Tools: ruff, mypy, eslint, tsc, clippy
- Returns: {diagnostics: [], pass: bool}
BUILD (:!make equivalent)
- Whole-project compilation
- Dependency resolution
- Tools: pip, npm, cargo, go build, make
- Returns: {success: bool, artifacts: [], errors: []}
TEST (:!make test equivalent)
- Runtime behavior validation
- Isolated execution environment
- Tools: pytest, jest, go test, cargo test
- Returns: {passed: int, failed: int, summary: str}
Auto-Detection
# test.detect() returns:
{
    "test_runner": {"name": "pytest", "cmd": ["pytest", "-v"]},
    "build_tool": {"name": "pip", "cmd": ["pip", "install", "-e", "."]},
    "check_tool": {"name": "ruff", "cmd": ["ruff", "check", "."]}
}
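One plausible way to produce such a result is to look for marker files in the project root. The sketch below assumes that approach; the marker-to-tool mapping is illustrative (the Python entry mirrors the result shown above, the Rust entry uses the standard `cargo` subcommands) and is not taken from the TestTool source.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Marker file → toolchain. Order matters: the first marker found wins.
MARKERS = [
    ("pyproject.toml", {
        "test_runner": {"name": "pytest", "cmd": ["pytest", "-v"]},
        "build_tool": {"name": "pip", "cmd": ["pip", "install", "-e", "."]},
        "check_tool": {"name": "ruff", "cmd": ["ruff", "check", "."]},
    }),
    ("Cargo.toml", {
        "test_runner": {"name": "cargo test", "cmd": ["cargo", "test"]},
        "build_tool": {"name": "cargo", "cmd": ["cargo", "build"]},
        "check_tool": {"name": "clippy", "cmd": ["cargo", "clippy"]},
    }),
]


def detect(root: Path) -> Optional[dict]:
    """Return the toolchain for the first marker file present, else None."""
    for marker, tools in MARKERS:
        if (root / marker).is_file():
            return tools
    return None


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        before = detect(root)                      # empty directory
        (root / "Cargo.toml").write_text("[package]\n")
        after = detect(root)                       # now looks like a Rust crate
    return before, after


empty_result, rust_result = demo()
```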
Implementation
BaseTool Pattern
import os
from typing import ClassVar

from hanzo_tools.core import BaseTool, ActionHandler, ToolError


class FsTool(BaseTool):
    name: ClassVar[str] = "fs"

    def __init__(self, cwd: str | None = None):
        super().__init__()
        self.cwd = cwd or os.getcwd()
        self._register_actions()

    def _register_actions(self):
        @self.action("read", "Read file content")
        async def read(ctx, path: str, encoding: str = "utf-8"):
            """Path → Content"""
            # Implementation elided: loads `content` from `path`
            return {"text": content, "hash": content_hash(content)}

        @self.action("search", "Search for pattern")
        async def search(ctx, pattern: str, path: str = ".", **opts):
            """(Pattern, Scope) → [Match]"""
            # Implementation elided: collects `matches` under `path`
            return {"matches": matches, "count": len(matches)}
Unified Response Envelope
All tools return a consistent envelope:
{
    "ok": True,          # Success indicator
    "data": {...},       # Action-specific result
    "error": None,       # Or {code, message, details}
    "meta": {            # Optional metadata
        "duration_ms": 42,
        "effect": "deterministic"
    }
}
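A wrapper that produces this envelope for any handler might look like the following. It is a sketch under assumptions: `envelope` is a hypothetical helper, and using the exception class name as the error `code` is an illustrative choice rather than the documented behaviour.

```python
import time
from typing import Any, Callable


def envelope(handler: Callable[[], Any], effect: str) -> dict:
    """Run `handler` and wrap its result or exception in the response envelope."""
    start = time.perf_counter()
    try:
        data, error, ok = handler(), None, True
    except Exception as exc:
        data, ok = None, False
        error = {"code": type(exc).__name__, "message": str(exc), "details": None}
    duration_ms = int((time.perf_counter() - start) * 1000)
    return {
        "ok": ok,
        "data": data,
        "error": error,
        "meta": {"duration_ms": duration_ms, "effect": effect},
    }


success = envelope(lambda: {"count": 3}, effect="pure")
failure = envelope(lambda: 1 / 0, effect="pure")
```

Catching the exception inside the wrapper keeps the response shape identical for success and failure, so callers can branch on `ok` alone.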
Entry Points
Each package exports via entry points:
# pyproject.toml
[project.entry-points."hanzo.tools"]
filesystem = "hanzo_tools.filesystem:TOOLS"
code = "hanzo_tools.code:TOOLS"
plan = "hanzo_tools.plan:TOOLS"
test = "hanzo_tools.test:TOOLS"
net = "hanzo_tools.net:TOOLS"
Package Structure
pkg/
├── hanzo-tools-core/ # BaseTool, IdTool, ToolRegistry
│ └── hanzo_tools/core/
├── hanzo-tools-filesystem/ # FsTool (read, write, edit, search, etc.)
│ └── hanzo_tools/filesystem/
├── hanzo-tools-code/ # CodeTool (parse, transform, summarize)
│ └── hanzo_tools/code/
├── hanzo-tools-shell/ # ProcTool (run, bg, signal, wait)
│ └── hanzo_tools/shell/
├── hanzo-tools-vcs/ # VcsTool (status, diff, commit, log)
│ └── hanzo_tools/vcs/
├── hanzo-tools-test/ # TestTool (check, build, test)
│ └── hanzo_tools/test/
├── hanzo-tools-net/ # NetTool (search, fetch, download, crawl)
│ └── hanzo_tools/net/
├── hanzo-tools-plan/ # PlanTool (intent, route, compose)
│ └── hanzo_tools/plan/
├── hanzo-tools-browser/ # BrowserTool (Playwright control)
│ └── hanzo_tools/browser/
├── hanzo-tools-computer/ # ComputerTool (OS desktop control)
│ └── hanzo_tools/computer/
└── hanzo-mcp/ # MCP server (discovers tools via entry points)
    └── hanzo_mcp/
Migration Path
Phase 1: Implement Core Operators
- [x] fs - Filesystem operations (existing hanzo-tools-filesystem)
- [x] id - Identity operations (hash, uri, ref, verify)
- [x] code - Code operations (parse, transform, summarize)
- [x] proc - Process execution (existing hanzo-tools-shell)
- [x] vcs - Version control (existing hanzo-tools-vcs)
- [x] test - Validation loops (check, build, test)
- [x] net - Network operations (search, fetch, download, crawl)
- [x] plan - Orchestration (intent, route, compose)
Phase 2: Control Surfaces
- [x] browser - Web DOM control (existing hanzo-tools-browser)
- [ ] computer - OS desktop control (needs update)
Phase 3: Extended Operators
- [ ] lsp - Language server integration
- [ ] dbg - Debugger control (breakpoint, step, eval)
- [ ] repl - Interactive evaluation (send, recv, reset)
Phase 4: Deprecation
- Deprecate individual tools (read_file, search_code, etc.)
- Map old names to new unified tools
- Remove after 6 months
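The name mapping in this phase could be a thin alias table that warns on use. In the sketch below, the legacy names come from the list above, while the targets they map to and the `resolve` helper are assumptions made for illustration.

```python
import warnings
from typing import Optional

# Legacy tool name → (unified tool, action). Targets are assumed, not specified.
LEGACY_ALIASES = {
    "read_file": ("fs", "read"),
    "search_code": ("fs", "search"),
}


def resolve(tool: str, action: Optional[str] = None) -> tuple:
    """Translate a legacy tool name to (tool, action), warning when one is used."""
    if tool in LEGACY_ALIASES:
        new_tool, new_action = LEGACY_ALIASES[tool]
        warnings.warn(
            f"{tool} is deprecated; use {new_tool} with action={new_action!r}",
            DeprecationWarning,
            stacklevel=2,
        )
        return new_tool, new_action
    return tool, action


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = resolve("read_file")
current = resolve("fs", "stat")
```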
Composition Examples
Safe Refactoring
# 1. Find all references (PURE)
refs = await code.call(ctx, action="references", path="src/auth.py", position={"line": 42, "col": 10})
# 2. Generate patch (PURE)
patch = await code.call(ctx, action="transform", path="src/auth.py", kind="rename", old_name="authenticate", new_name="verify_user")
# 3. Summarize changes (PURE)
summary = await code.call(ctx, action="summarize", diff=patch["patch"])
# 4. [POLICY GATE] - User approves
# 5. Apply patch (EFFECT)
await fs.call(ctx, action="patch", path="src/auth.py", patch=patch["patch"], base_hash=patch["base_hash"])
# 6. Verify (EFFECT)
await test.call(ctx, action="test")
Site Mirroring
# Crawl site recursively
result = await net.call(ctx, action="crawl", url="https://docs.example.com", dest="./mirror", depth=3, limit=100)
# Returns: {pages: ["/mirror/index.html", ...], count: 47}
Intelligent Search
# Parse intent
intent = await plan.call(ctx, action="intent", nl="find where user authentication happens")
# Route to canonical chain (named `routed` so the `plan` tool is not shadowed)
routed = await plan.call(ctx, action="route", intent_ir=intent)
# Execute chain
for step in routed["nodes"]:
    if step.get("policy_gate"):
        # Request approval before running this step
        pass
    result = await dispatch(step["tool"], step["action"], step["params"])
Security Considerations
- Path Traversal: All path operations validate against allowed directories
- Command Injection: proc.run sanitizes inputs, requires explicit shell=True
- Network Access: net.* operations respect proxy settings and rate limits
- File Modifications: Require base_hash precondition to prevent race conditions
- Policy Gates: High-risk operations require explicit approval
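The path traversal check can be illustrated with `pathlib`. This is a minimal sketch assuming Python 3.9 or later (for `Path.is_relative_to`); `resolve_within` is a hypothetical helper and not the validation routine used by the tools.

```python
import tempfile
from pathlib import Path


def resolve_within(path: str, allowed_roots: list) -> Path:
    """Resolve `path` and return it only if it lies under an allowed root."""
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = Path(path).resolve()
    for root in allowed_roots:
        if candidate.is_relative_to(Path(root).resolve()):
            return candidate
    raise PermissionError(f"{path} is outside the allowed directories")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    inside = resolve_within(str(root / "src" / "main.py"), [root])
    inside_ok = inside.is_relative_to(root.resolve())
    try:
        resolve_within(str(root / ".." / "etc" / "passwd"), [root])
        escaped = True
    except PermissionError:
        escaped = False
```

Resolving before comparing is the important step: a plain string prefix check would accept a path such as `allowed/../secret`.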
Changelog
- 2025-01-22: Initial draft with operator lattice specification
- 2025-01-21: Created CodeTool, PlanTool, TestTool, NetTool, IdTool packages