HIP-0300: Unified MCP Tools Architecture

Status: Draft
Author: Hanzo AI
Created: January 2025
Updated: January 2025

Abstract

HIP-0300 defines a unified architecture for Model Context Protocol (MCP) tools, consolidating 52+ individual tools into ~16 orthogonal, composable operators. The design follows the Unix philosophy: each tool does one thing well, and tools combine under clear composition laws.

Motivation

The original hanzo-mcp implementation grew organically to 52+ tools with significant overlap:

- Multiple search tools (grep, search, find, ast_search)
- Multiple file tools (read, write, edit, cat, head, tail)
- Multiple shell tools (bash, zsh, sh, shell, cmd)

This creates confusion for LLM agents and increases cognitive load. HIP-0300 restructures tools around orthogonal axes with a formal effect lattice.

Design Philosophy

Core Principles

Orthogonal Axes: Tools organized along independent dimensions
Composability: Small operators that combine predictably
Effect Tracking: Every operation has a declared effect level
Transform vs Apply: Pure transforms produce Patches; Apply commits them
Minimal Surface Area: ~16 operators (8 core, 2 control surfaces, 6 extended) replace the 52+ legacy tools

The Three Lattices

Three lattices constrain all operations:

EFFECT LATTICE (purity)
├─ PURE                    (no side effects, referentially transparent)
├─ DETERMINISTIC_EFFECT    (effects, but reproducible)
└─ NONDETERMINISTIC_EFFECT (network, time, randomness)

REPRESENTATION LATTICE (data granularity)
├─ Bytes   (raw content)
├─ Lines   (text with line numbers)
├─ AST     (parsed structure)
├─ Patch   (diff/change set)
└─ Symbols (semantic references)

SCOPE LATTICE (operational boundary)
├─ Span    (byte/char range)
├─ Region  (line range)
├─ File    (single file)
├─ Tree    (directory subtree)
└─ Repo    (entire repository)
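One way to read the effect lattice is as a total order in which a chain of operations is as impure as its least pure step. The sketch below assumes that ordering (PURE below DETERMINISTIC_EFFECT below NONDETERMINISTIC_EFFECT); the `EffectLevel` enum and `join` helper are illustrative names, not part of hanzo-tools.

```python
from enum import IntEnum
from functools import reduce


class EffectLevel(IntEnum):
    """Illustrative ordering of the effect lattice (assumed to be a total order)."""
    PURE = 0
    DETERMINISTIC_EFFECT = 1
    NONDETERMINISTIC_EFFECT = 2


def join(*levels: EffectLevel) -> EffectLevel:
    """Least upper bound: a chain is as impure as its least pure step."""
    return reduce(max, levels, EffectLevel.PURE)


# code.transform (PURE) followed by fs.patch (DETERMINISTIC_EFFECT)
chain_effect = join(EffectLevel.PURE, EffectLevel.DETERMINISTIC_EFFECT)
print(chain_effect.name)  # DETERMINISTIC_EFFECT
```

Under this reading, a planner could compute the effect of a whole chain before running any step of it.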

Operator Surface

Core Operators (HIP-0300)

Tool	Axis	Actions	Effect
`fs`	Bytes + Paths	read, write, edit, search, list, stat, patch, tree, glob	DETERMINISTIC
`id`	Identity	hash, uri, ref, verify	PURE
`code`	Symbols + Structure	parse, serialize, symbols, definition, references, transform, summarize	PURE/DETERMINISTIC
`proc`	Execution	run, bg, signal, status, wait	NONDETERMINISTIC
`vcs`	History + Diffs	status, diff, commit, log, branch, stash	DETERMINISTIC
`test`	Validation	check, build, test, detect	NONDETERMINISTIC
`net`	Network	search, fetch, download, crawl, head	NONDETERMINISTIC
`plan`	Orchestration	intent, route, compose	PURE

Control Surfaces

Tool	Surface	Actions	Effect
`browser`	Web DOM	navigate, click, type, screenshot, evaluate, etc.	NONDETERMINISTIC
`computer`	OS Desktop	screenshot, click, type, key, scroll, etc.	NONDETERMINISTIC

Extended Operators

Tool	Domain	Actions
`lsp`	Semantic Stream	diagnostics, code_actions, hover, completion
`memory`	Knowledge Persistence	read, write, search, create, recall
`todo`	Task Tracking	list, add, update, remove
`reasoning`	Cognition	think, critic
`agent`	Multi-Agent	run, list, status, config
`llm`	LLM Interface	chat, consensus

Verb Kernel

The verb kernel below lists 43 verbs across the eight core operators, each with a typed signature:

File Operations (fs)

read    : Path → Bytes | Text | Lines
write   : (Path, Content) → {ok, hash}
edit    : (Path, Patch) → {ok, hash}
search  : (Pattern, Scope) → [Match]
list    : Path → [Entry]
stat    : Path → {size, mtime, mode, hash}
patch   : (Path, Patch, base_hash?) → {ok, new_hash}
tree    : (Path, depth?) → TreeNode
glob    : (Pattern, Path?) → [Path]

Identity Operations (id)

hash    : Content → {digest, algo, size}
uri     : Path → {uri, path, exists}
ref     : (Path, line?, col?) → {uri, range?, hash?}
verify  : (Content, Digest) → {match, actual, expected}
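The `hash` and `verify` signatures above can be sketched with the standard library alone. This is a minimal illustration, assuming digests are written as `algo:hexdigest` (the form used by the `sha256:...` examples elsewhere in this document); the function names `hash_content` and `verify_content` are hypothetical.

```python
import hashlib


def hash_content(content: bytes, algo: str = "sha256") -> dict:
    """hash : Content -> {digest, algo, size}"""
    digest = hashlib.new(algo, content).hexdigest()
    return {"digest": f"{algo}:{digest}", "algo": algo, "size": len(content)}


def verify_content(content: bytes, expected: str) -> dict:
    """verify : (Content, Digest) -> {match, actual, expected}"""
    algo = expected.split(":", 1)[0]  # the prefix names the algorithm
    actual = hash_content(content, algo)["digest"]
    return {"match": actual == expected, "actual": actual, "expected": expected}


info = hash_content(b"hello")
print(verify_content(b"hello", info["digest"])["match"])   # True
print(verify_content(b"hello!", info["digest"])["match"])  # False
```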

Code Operations (code)

parse     : (Path | Text, lang?) → AST
serialize : AST → Text
symbols   : (Path | AST, kind?) → [Symbol]
definition : (Path, Position) → [Location]
references : (Path, Position) → [Location]
transform  : (Path | Text, kind, params) → Patch  # PURE!
summarize  : (Diff | Log | Report) → {summary, risks, next_actions}

Process Operations (proc)

run     : Command → {stdout, stderr, code}
bg      : Command → ProcessID
signal  : (ProcessID, Signal) → {ok}
status  : ProcessID? → [ProcessStatus]
wait    : (ProcessID, timeout?) → {stdout, stderr, code}

Version Control (vcs)

status  : () → {staged, unstaged, untracked}
diff    : (ref1?, ref2?, paths?) → Diff
commit  : (message, files?) → {sha, message}
log     : (n?, since?, until?, path?) → [Commit]
branch  : (name?, action?) → {current, branches}
stash   : (action, message?) → {ok}

Validation (test)

check   : (Path?, tool?) → {diagnostics, pass}  # Lint/typecheck
build   : (Path?, tool?) → {success, artifacts}  # Compilation
test    : (selector?, tool?) → {passed, failed, summary}  # Runtime
detect  : (Path?) → {test_runner, build_tool, check_tool}

Network (net)

search   : (Query, engine?) → [{url, title, snippet}]
fetch    : (URL, extract_text?) → {text, mime, status, hash}
download : (URL, dest?, assets?) → {path, size, mime}
crawl    : (URL, dest, depth?, limit?) → {pages, count}
head     : URL → {status, headers, size?, mime?}

Orchestration (plan)

intent  : NL → IntentIR
route   : (IntentIR, Policy?) → Plan
compose : Plan → ExecGraph

Effect Annotations

Every action declares its effect level:

from enum import Enum

class Effect(Enum):
    PURE = "pure"                          # No side effects
    DETERMINISTIC = "deterministic"        # Effects, reproducible
    NONDETERMINISTIC = "nondeterministic"  # Network/time/random

# Examples (using the enum member names above):
# fs.read        → DETERMINISTIC (reads disk)
# code.parse     → PURE (in-memory transform)
# code.transform → PURE (produces Patch value)
# fs.patch       → DETERMINISTIC (writes disk)
# net.fetch      → NONDETERMINISTIC (network I/O)
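How an action declares its effect is not specified above. One plausible shape is a decorator that records the declaration in a registry; the `declares` decorator and `EFFECTS` registry below are hypothetical names for this sketch, and the two toy actions stand in for real tool code.

```python
from enum import Enum


class Effect(Enum):
    PURE = "pure"
    DETERMINISTIC = "deterministic"
    NONDETERMINISTIC = "nondeterministic"


EFFECTS: dict[str, Effect] = {}  # hypothetical registry: "tool.action" -> Effect


def declares(name: str, effect: Effect):
    """Hypothetical decorator that records an action's declared effect."""
    def wrap(fn):
        EFFECTS[name] = effect
        fn.effect = effect
        return fn
    return wrap


@declares("code.parse", Effect.PURE)
def parse(text: str) -> list[str]:
    return text.split()  # toy stand-in for a real parser


@declares("net.fetch", Effect.NONDETERMINISTIC)
def fetch(url: str) -> str:
    raise NotImplementedError("network access is out of scope for this sketch")


# A planner could treat only PURE actions as safe to run in a preview.
preview_safe = [name for name, eff in EFFECTS.items() if eff is Effect.PURE]
print(preview_safe)  # ['code.parse']
```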

Transform vs Apply

Critical separation of concerns:

Transform (PURE)              Apply (EFFECT)
─────────────────            ────────────────
code.transform               fs.patch
  → Patch value                → committed change
  → base_hash                  → requires base_hash precondition
  → preview-safe               → point of no return

vcs.diff                     vcs.commit
  → Diff value                 → committed history

Content-Addressable Edits

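All mutations are content-addressed for safety: a transform records the hash of the content it was computed against, and the apply step fails if the file no longer matches that hash.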

# Generate a transform (PURE: nothing is written yet)
transform = await code.call(ctx, action="transform",
                            path="/src/main.py",
                            kind="rename",
                            old_name="foo",
                            new_name="bar")
# Returns: {patch: [...], base_hash: "sha256:abc...", new_hash: "sha256:def..."}

# Apply with precondition
result = await fs.call(ctx, action="patch",
                       path="/src/main.py",
                       patch=transform["patch"],
                       base_hash=transform["base_hash"])  # Must match current file hash
# Fails if file changed since transform was computed
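The precondition itself is simple to sketch. The example below is a minimal illustration of a `base_hash` check, assuming SHA-256 digests in the `sha256:...` form shown above; `apply_with_precondition` is a hypothetical helper that replaces whole file contents rather than applying a real patch.

```python
import hashlib
import tempfile
from pathlib import Path


def digest(data: bytes) -> str:
    return "sha256:" + hashlib.sha256(data).hexdigest()


def apply_with_precondition(path: Path, new_content: bytes, base_hash: str) -> dict:
    """Write new_content only if the file still hashes to base_hash."""
    current = digest(path.read_bytes())
    if current != base_hash:
        return {"ok": False, "error": "base_hash mismatch", "actual": current}
    path.write_bytes(new_content)
    return {"ok": True, "new_hash": digest(new_content)}


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "main.py"
    target.write_bytes(b"def foo(): pass\n")
    base = digest(target.read_bytes())

    first = apply_with_precondition(target, b"def bar(): pass\n", base)
    # The same base_hash is now stale, so a second apply is rejected.
    second = apply_with_precondition(target, b"def baz(): pass\n", base)
    final = target.read_bytes()

print(first["ok"], second["ok"])  # True False
```

The check and the write are separate steps here, so this sketch narrows the race window without closing it; a production version would need a lock or an atomic rename.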

Intent Routing

Natural language maps to canonical operator chains:

Intent IR Structure

from dataclasses import dataclass

@dataclass
class IntentIR:
    category: str      # navigate, explain, modify, validate, debug, create
    action: str        # find, understand, rename, test, trace, add
    target: str        # extracted entity/path/concept
    confidence: float

Canonical Chains

Intent Pattern	Canonical Chain
"find X"	`fs.search(X)`
"what is X"	`fs.search(X) → code.summarize`
"rename X to Y"	`code.references(X) → code.transform(rename) → code.summarize → [policy] → fs.patch → test.test`
"fix bug in X"	`fs.read(X) → code.parse → code.transform(fix) → [policy] → fs.patch → test.test`
"add tests for X"	`code.symbols(X) → code.transform(add_tests) → fs.write → test.test`
"why does X fail"	`test.test(X) → vcs.log → code.summarize`
"refactor X"	`code.references(X) → code.transform → code.summarize → [policy] → fs.patch → test.test`

Policy Gates

High-risk operations require explicit approval:

- fs.patch (file modifications)
- fs.write (new file creation)
- proc.run with shell=True
- vcs.commit / vcs.push

Validation Loops

Three distinct validation operations (Vim-inspired):

CHECK (`:make` equivalent)

Fast, incremental feedback
Lint, typecheck, format check
Tools: ruff, mypy, eslint, tsc, clippy
Returns: {diagnostics: [], pass: bool}

BUILD (`:!make` equivalent)

Whole-project compilation
Dependency resolution
Tools: pip, npm, cargo, go build, make
Returns: {success: bool, artifacts: [], errors: []}

TEST (`:!make test` equivalent)

Runtime behavior validation
Isolated execution environment
Tools: pytest, jest, go test, cargo test
Returns: {passed: int, failed: int, summary: str}

Auto-Detection

# test.detect() returns:
{
    "test_runner": {"name": "pytest", "cmd": ["pytest", "-v"]},
    "build_tool": {"name": "pip", "cmd": ["pip", "install", "-e", "."]},
    "check_tool": {"name": "ruff", "cmd": ["ruff", "check", "."]}
}
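The detection rules are not spelled out here. A common approach, shown below purely as an assumption, is to look for a project marker file and map it to a toolchain; the `MARKERS` table and the simplified name-only return value are hypothetical, with tool names taken from the lists in the Validation Loops section.

```python
import tempfile
from pathlib import Path

# Hypothetical marker-file table; the real detection rules are not specified here.
MARKERS = {
    "pyproject.toml": {"test_runner": "pytest", "build_tool": "pip", "check_tool": "ruff"},
    "package.json": {"test_runner": "jest", "build_tool": "npm", "check_tool": "eslint"},
    "Cargo.toml": {"test_runner": "cargo test", "build_tool": "cargo", "check_tool": "clippy"},
}


def detect(root: Path) -> dict:
    """Return the tool names for the first marker file found in root."""
    for marker, tools in MARKERS.items():
        if (root / marker).is_file():
            return dict(tools)
    return {"test_runner": None, "build_tool": None, "check_tool": None}


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "Cargo.toml").write_text("[package]\n")
    detected = detect(Path(tmp))
    empty = detect(Path(tmp) / "missing")

print(detected["test_runner"])  # cargo test
```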

Implementation

BaseTool Pattern

import os
from typing import ClassVar

from hanzo_tools.core import BaseTool, ActionHandler, ToolError

class FsTool(BaseTool):
    name: ClassVar[str] = "fs"

    def __init__(self, cwd: str | None = None):
        super().__init__()
        self.cwd = cwd or os.getcwd()
        self._register_actions()

    def _register_actions(self):
        # Each action is registered under a name with a one-line description.
        @self.action("read", "Read file content")
        async def read(ctx, path: str, encoding: str = "utf-8"):
            """Path → Content"""
            # Implementation (elided): load `path` into `content`;
            # `content_hash` is a helper not shown here
            return {"text": content, "hash": content_hash(content)}

        @self.action("search", "Search for pattern")
        async def search(ctx, pattern: str, path: str = ".", **opts):
            """(Pattern, Scope) → [Match]"""
            # Implementation (elided): collect results into `matches`
            return {"matches": matches, "count": len(matches)}

Unified Response Envelope

All tools return a consistent envelope:

{
    "ok": True,           # Success indicator
    "data": {...},        # Action-specific result
    "error": None,        # Or {code, message, details}
    "meta": {             # Optional metadata
        "duration_ms": 42,
        "effect": "deterministic"
    }
}
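A small wrapper can produce this envelope for any action. The sketch below is one way to do it, assuming the exception class name serves as the error `code`; the `envelope` helper is a hypothetical name, not part of `hanzo_tools.core`.

```python
import time
from typing import Any, Callable


def envelope(fn: Callable[[], Any], effect: str) -> dict:
    """Run fn and wrap its result (or its exception) in the response envelope."""
    start = time.perf_counter()
    try:
        data, error, ok = fn(), None, True
    except Exception as exc:  # report failures in-band instead of raising
        data, ok = None, False
        error = {"code": type(exc).__name__, "message": str(exc), "details": None}
    duration_ms = int((time.perf_counter() - start) * 1000)
    return {"ok": ok, "data": data, "error": error,
            "meta": {"duration_ms": duration_ms, "effect": effect}}


good = envelope(lambda: {"text": "hi"}, effect="deterministic")
bad = envelope(lambda: 1 / 0, effect="pure")
print(good["ok"], bad["error"]["code"])  # True ZeroDivisionError
```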

Entry Points

Each package exports via entry points:

# pyproject.toml
[project.entry-points."hanzo.tools"]
filesystem = "hanzo_tools.filesystem:TOOLS"
code = "hanzo_tools.code:TOOLS"
plan = "hanzo_tools.plan:TOOLS"
test = "hanzo_tools.test:TOOLS"
net = "hanzo_tools.net:TOOLS"

Package Structure

pkg/
├── hanzo-tools-core/          # BaseTool, IdTool, ToolRegistry
│   └── hanzo_tools/core/
├── hanzo-tools-filesystem/    # FsTool (read, write, edit, search, etc.)
│   └── hanzo_tools/filesystem/
├── hanzo-tools-code/          # CodeTool (parse, transform, summarize)
│   └── hanzo_tools/code/
├── hanzo-tools-shell/         # ProcTool (run, bg, signal, wait)
│   └── hanzo_tools/shell/
├── hanzo-tools-vcs/           # VcsTool (status, diff, commit, log)
│   └── hanzo_tools/vcs/
├── hanzo-tools-test/          # TestTool (check, build, test)
│   └── hanzo_tools/test/
├── hanzo-tools-net/           # NetTool (search, fetch, download, crawl)
│   └── hanzo_tools/net/
├── hanzo-tools-plan/          # PlanTool (intent, route, compose)
│   └── hanzo_tools/plan/
├── hanzo-tools-browser/       # BrowserTool (Playwright control)
│   └── hanzo_tools/browser/
├── hanzo-tools-computer/      # ComputerTool (OS desktop control)
│   └── hanzo_tools/computer/
└── hanzo-mcp/                 # MCP server (discovers tools via entry points)
    └── hanzo_mcp/

Migration Path

Phase 1: Implement Core Operators

[x] fs - Filesystem operations (existing hanzo-tools-filesystem)
[x] id - Identity operations (hash, uri, ref, verify)
[x] code - Code operations (parse, transform, summarize)
[x] proc - Process execution (existing hanzo-tools-shell)
[x] vcs - Version control (existing hanzo-tools-vcs)
[x] test - Validation loops (check, build, test)
[x] net - Network operations (search, fetch, download, crawl)
[x] plan - Orchestration (intent, route, compose)

Phase 2: Control Surfaces

[x] browser - Web DOM control (existing hanzo-tools-browser)
[ ] computer - OS desktop control (needs update)

Phase 3: Extended Operators

[ ] lsp - Language server integration
[ ] dbg - Debugger control (breakpoint, step, eval)
[ ] repl - Interactive evaluation (send, recv, reset)

Phase 4: Deprecation

Deprecate individual tools (read_file, search_code, etc.)
Map old names to new unified tools
Remove after 6 months

Composition Examples

Safe Refactoring

# 1. Find all references (PURE)
refs = await code.call(ctx, action="references", path="src/auth.py", position={"line": 42, "col": 10})

# 2. Generate patch (PURE)
patch = await code.call(ctx, action="transform", path="src/auth.py", kind="rename", old_name="authenticate", new_name="verify_user")

# 3. Summarize changes (PURE)
summary = await code.call(ctx, action="summarize", diff=patch["patch"])

# 4. [POLICY GATE] - User approves

# 5. Apply patch (EFFECT)
await fs.call(ctx, action="patch", path="src/auth.py", patch=patch["patch"], base_hash=patch["base_hash"])

# 6. Verify (EFFECT)
await test.call(ctx, action="test")

Site Mirroring

# Crawl site recursively
result = await net.call(ctx, action="crawl", url="https://docs.example.com", dest="./mirror", depth=3, limit=100)

# Returns: {pages: ["/mirror/index.html", ...], count: 47}

Intelligent Search

# Parse intent
intent = await plan.call(ctx, action="intent", nl="find where user authentication happens")

# Route to canonical chain (bound to a new name so the `plan` tool is not shadowed)
route = await plan.call(ctx, action="route", intent_ir=intent)

# Execute chain
for step in route["nodes"]:
    if step.get("policy_gate"):
        # Request approval
        pass
    result = await dispatch(step["tool"], step["action"], step["params"])

Security Considerations

Path Traversal: All path operations validate against allowed directories
Command Injection: proc.run sanitizes inputs and invokes a shell only when shell=True is passed explicitly
Network Access: net.* operations respect proxy settings and rate limits
File Modifications: Require base_hash precondition to prevent race conditions
Policy Gates: High-risk operations require explicit approval
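The path traversal check in the list above can be sketched with `pathlib`: resolve the requested path first, then test containment against each allowed root. This is an illustration under assumptions; `resolve_allowed` and the `/srv/project` root are hypothetical.

```python
from pathlib import Path


def resolve_allowed(path: str, allowed_roots: list[Path]) -> Path:
    """Resolve path and reject it unless it lies inside an allowed root."""
    resolved = Path(path).resolve()  # collapses ".." and follows symlinks
    for root in allowed_roots:
        if resolved.is_relative_to(root.resolve()):  # Python 3.9+
            return resolved
    raise PermissionError(f"{path} is outside the allowed directories")


roots = [Path("/srv/project")]
print(resolve_allowed("/srv/project/src/main.py", roots))
try:
    resolve_allowed("/srv/project/../secrets.txt", roots)
except PermissionError as exc:
    print("rejected:", exc)
```

Resolving before comparing matters: a plain string prefix check would accept `/srv/project/../secrets.txt`.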

Changelog

2025-01-22: Initial draft with operator lattice specification
2025-01-21: Created CodeTool, PlanTool, TestTool, NetTool, IdTool packages