HIP-0300: Unified MCP Tools Architecture

Status: Draft
Author: Hanzo AI
Created: January 2025
Updated: January 2025

Abstract

HIP-0300 defines a unified architecture for Model Context Protocol (MCP) tools, consolidating 52+ individual tools into ~16 orthogonal, composable operators. The design follows Unix philosophy: each tool does one thing well, with clear composition laws.

Motivation

The original hanzo-mcp implementation grew organically to 52+ tools with significant overlap:

  • Multiple search tools (grep, search, find, ast_search)
  • Multiple file tools (read, write, edit, cat, head, tail)
  • Multiple shell tools (bash, zsh, sh, shell, cmd)

This overlap forces LLM agents to choose among near-duplicate tools and enlarges the tool list they must reason over. HIP-0300 restructures the tools around orthogonal axes with a formal effect lattice.

Design Philosophy

Core Principles

  1. Orthogonal Axes: Tools organized along independent dimensions
  2. Composability: Small operators that combine predictably
  3. Effect Tracking: Every operation has a declared effect level
  4. Transform vs Apply: Pure transforms produce Patches; Apply commits them
  5. Minimal Surface Area: 16 operators (8 core, 2 control surfaces, 6 extended) cover the intended use cases

The Three Lattices

Three lattices constrain all operations:

EFFECT LATTICE (purity)
├─ PURE                    (no side effects, referentially transparent)
├─ DETERMINISTIC_EFFECT    (effects, but reproducible)
└─ NONDETERMINISTIC_EFFECT (network, time, randomness)

REPRESENTATION LATTICE (data granularity)
├─ Bytes   (raw content)
├─ Lines   (text with line numbers)
├─ AST     (parsed structure)
├─ Patch   (diff/change set)
└─ Symbols (semantic references)

SCOPE LATTICE (operational boundary)
├─ Span    (byte/char range)
├─ Region  (line range)
├─ File    (single file)
├─ Tree    (directory subtree)
└─ Repo    (entire repository)
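
One way to read the effect lattice is as an ordering in which a chain of operations is only as pure as its least pure step. The sketch below assumes the three levels form a total order (PURE below DETERMINISTIC_EFFECT below NONDETERMINISTIC_EFFECT); `join` and `chain_effect` are hypothetical helper names, not part of the specification.

```python
from enum import IntEnum
from functools import reduce


class Effect(IntEnum):
    """Effect levels, ordered from most to least pure (assumed total order)."""
    PURE = 0
    DETERMINISTIC_EFFECT = 1
    NONDETERMINISTIC_EFFECT = 2


def join(a: Effect, b: Effect) -> Effect:
    """Least upper bound of two effect levels."""
    return max(a, b)


def chain_effect(effects: list[Effect]) -> Effect:
    """Effect of a whole chain; an empty chain is treated as PURE."""
    return reduce(join, effects, Effect.PURE)


# code.references -> code.transform -> fs.patch
refactor = [Effect.PURE, Effect.PURE, Effect.DETERMINISTIC_EFFECT]
print(chain_effect(refactor).name)
```

Under this reading, a planner can compute the effect of a candidate chain before running anything and reject chains that exceed what a policy allows.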

Operator Surface

Core Operators (HIP-0300)

Tool  Axis                 Actions                                                                  Effect
────  ───────────────────  ───────────────────────────────────────────────────────────────────────  ──────────────────
fs    Bytes + Paths        read, write, edit, search, list, stat, patch, tree, glob                 DETERMINISTIC
id    Identity             hash, uri, ref, verify                                                   PURE
code  Symbols + Structure  parse, serialize, symbols, definition, references, transform, summarize  PURE/DETERMINISTIC
proc  Execution            run, bg, signal, status, wait                                            NONDETERMINISTIC
vcs   History + Diffs      status, diff, commit, log, branch, stash                                 DETERMINISTIC
test  Validation           check, build, test, detect                                               NONDETERMINISTIC
net   Network              search, fetch, download, crawl, head                                     NONDETERMINISTIC
plan  Orchestration        intent, route, compose                                                   PURE

Control Surfaces

Tool      Surface     Actions                                            Effect
────────  ──────────  ─────────────────────────────────────────────────  ────────────────
browser   Web DOM     navigate, click, type, screenshot, evaluate, etc.  NONDETERMINISTIC
computer  OS Desktop  screenshot, click, type, key, scroll, etc.         NONDETERMINISTIC

Extended Operators

Tool       Domain                 Actions
─────────  ─────────────────────  ────────────────────────────────────────────
lsp        Semantic Stream        diagnostics, code_actions, hover, completion
memory     Knowledge Persistence  read, write, search, create, recall
todo       Task Tracking          list, add, update, remove
reasoning  Cognition              think, critic
agent      Multi-Agent            run, list, status, config
llm        LLM Interface          chat, consensus

Verb Kernel

The verb kernel below lists 43 verbs across the eight core operators, each with a typed signature:

File Operations (fs)

read    : Path → Bytes | Text | Lines
write   : (Path, Content) → {ok, hash}
edit    : (Path, Patch) → {ok, hash}
search  : (Pattern, Scope) → [Match]
list    : Path → [Entry]
stat    : Path → {size, mtime, mode, hash}
patch   : (Path, Patch, base_hash?) → {ok, new_hash}
tree    : (Path, depth?) → TreeNode
glob    : (Pattern, Path?) → [Path]

Identity Operations (id)

hash    : Content → {digest, algo, size}
uri     : Path → {uri, path, exists}
ref     : (Path, line?, col?) → {uri, range?, hash?}
verify  : (Content, Digest) → {match, actual, expected}
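
The `hash` and `verify` signatures above can be sketched with the standard library. This is a minimal illustration, assuming digests are written as `algo:hexdigest` (the form used in the `sha256:abc...` examples elsewhere in this document); the function names are hypothetical.

```python
import hashlib


def hash_content(content: bytes, algo: str = "sha256") -> dict:
    """Content -> {digest, algo, size}."""
    hexdigest = hashlib.new(algo, content).hexdigest()
    return {"digest": f"{algo}:{hexdigest}", "algo": algo, "size": len(content)}


def verify(content: bytes, expected: str) -> dict:
    """(Content, Digest) -> {match, actual, expected}."""
    # The algorithm is read from the digest prefix, e.g. "sha256:...".
    algo = expected.split(":", 1)[0]
    actual = hash_content(content, algo)["digest"]
    return {"match": actual == expected, "actual": actual, "expected": expected}


info = hash_content(b"hello")
check = verify(b"hello", info["digest"])
```

Both functions are pure: they depend only on their arguments, which is consistent with the PURE effect declared for the id tool.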

Code Operations (code)

parse     : (Path | Text, lang?) → AST
serialize : AST → Text
symbols   : (Path | AST, kind?) → [Symbol]
definition : (Path, Position) → [Location]
references : (Path, Position) → [Location]
transform  : (Path | Text, kind, params) → Patch  # PURE!
summarize  : (Diff | Log | Report) → {summary, risks, next_actions}

Process Operations (proc)

run     : Command → {stdout, stderr, code}
bg      : Command → ProcessID
signal  : (ProcessID, Signal) → {ok}
status  : ProcessID? → [ProcessStatus]
wait    : (ProcessID, timeout?) → {stdout, stderr, code}
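
As an illustration of the `run` signature, the sketch below wraps `subprocess.run` and returns the `{stdout, stderr, code}` shape. It assumes a command is given as an argument list and never passed through a shell; the actual ProcTool may differ.

```python
import subprocess
import sys
from typing import Optional


def run(command: list[str], timeout: Optional[float] = None) -> dict:
    """Command -> {stdout, stderr, code}, executed without a shell."""
    completed = subprocess.run(
        command,
        capture_output=True,
        text=True,
        timeout=timeout,
        shell=False,  # arguments are passed as a list, not parsed by a shell
    )
    return {
        "stdout": completed.stdout,
        "stderr": completed.stderr,
        "code": completed.returncode,
    }


# Use the current interpreter so the example does not depend on PATH.
result = run([sys.executable, "-c", "print('hello')"])
```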

Version Control (vcs)

status  : () → {staged, unstaged, untracked}
diff    : (ref1?, ref2?, paths?) → Diff
commit  : (message, files?) → {sha, message}
log     : (n?, since?, until?, path?) → [Commit]
branch  : (name?, action?) → {current, branches}
stash   : (action, message?) → {ok}

Validation (test)

check   : (Path?, tool?) → {diagnostics, pass}  # Lint/typecheck
build   : (Path?, tool?) → {success, artifacts}  # Compilation
test    : (selector?, tool?) → {passed, failed, summary}  # Runtime
detect  : (Path?) → {test_runner, build_tool, check_tool}

Network (net)

search   : (Query, engine?) → [{url, title, snippet}]
fetch    : (URL, extract_text?) → {text, mime, status, hash}
download : (URL, dest?, assets?) → {path, size, mime}
crawl    : (URL, dest, depth?, limit?) → {pages, count}
head     : URL → {status, headers, size?, mime?}

Orchestration (plan)

intent  : NL → IntentIR
route   : (IntentIR, Policy?) → Plan
compose : Plan → ExecGraph

Effect Annotations

Every action declares its effect level:

from enum import Enum

class Effect(Enum):
    PURE = "pure"                          # No side effects
    DETERMINISTIC = "deterministic"        # Effects, but reproducible
    NONDETERMINISTIC = "nondeterministic"  # Network, time, randomness

# Examples (DETERMINISTIC and NONDETERMINISTIC are the lattice levels
# DETERMINISTIC_EFFECT and NONDETERMINISTIC_EFFECT):
# fs.read        → DETERMINISTIC (reads disk)
# code.parse     → PURE (in-memory transform)
# code.transform → PURE (produces Patch value)
# fs.patch       → DETERMINISTIC (writes disk)
# net.fetch      → NONDETERMINISTIC (network I/O)

Transform vs Apply

Critical separation of concerns:

Transform (PURE)              Apply (EFFECT)
─────────────────            ────────────────
code.transform               fs.patch
  → Patch value                → committed change
  → base_hash                  → requires base_hash precondition
  → preview-safe               → point of no return

vcs.diff                     vcs.commit
  → Diff value                 → committed history
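
The transform side of this split can be sketched as a function that takes text and returns a Patch value without touching disk. The example below is an assumption-laden illustration: it uses a naive whole-word textual rename rather than a real symbol-aware transform, and the `Patch` fields are hypothetical.

```python
import difflib
import hashlib
import re
from dataclasses import dataclass


def content_hash(text: str) -> str:
    return "sha256:" + hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Patch:
    diff: str       # unified diff, safe to show as a preview
    new_text: str   # full content after the change
    base_hash: str  # hash of the text the patch was computed against
    new_hash: str   # hash of the text after the change


def rename_transform(text: str, old_name: str, new_name: str) -> Patch:
    """Pure: the same inputs always give the same Patch; nothing is written."""
    new_text = re.sub(rf"\b{re.escape(old_name)}\b", new_name, text)
    diff = "".join(difflib.unified_diff(
        text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile="a", tofile="b",
    ))
    return Patch(diff, new_text, content_hash(text), content_hash(new_text))


source = "def foo():\n    return foo_bar\n"
patch = rename_transform(source, "foo", "bar")
```

Because the Patch is only a value, it can be previewed, summarized, or discarded freely; only the apply step carries an effect.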

Content-Addressable Edits

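All mutations are guarded by content hashes: a patch carries the hash of the file it was computed against, and fs.patch applies it only if the file still matches that hash.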

# Generate a transform
transform_result = await code.call(ctx, action="transform",
                                   path="/src/main.py",
                                   kind="rename",
                                   old_name="foo",
                                   new_name="bar")
# Returns: {patch: [...], base_hash: "sha256:abc...", new_hash: "sha256:def..."}

# Apply with precondition, reusing the patch and base_hash returned above
apply_result = await fs.call(ctx, action="patch",
                             path="/src/main.py",
                             patch=transform_result["patch"],
                             base_hash=transform_result["base_hash"])  # Must match current
# Fails if file changed since transform was computed
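
The precondition check on the apply side might look like the following. This is a minimal sketch, not the fs.patch implementation: it replaces the whole file content instead of applying a diff, and `StaleBaseError` and `apply_patch` are hypothetical names.

```python
import hashlib
import tempfile
from pathlib import Path


class StaleBaseError(Exception):
    """The file changed after the patch was computed."""


def content_hash(data: bytes) -> str:
    return "sha256:" + hashlib.sha256(data).hexdigest()


def apply_patch(path: Path, new_content: bytes, base_hash: str) -> dict:
    """Write new_content only if the file still hashes to base_hash."""
    current = content_hash(path.read_bytes())
    if current != base_hash:
        raise StaleBaseError(f"expected {base_hash}, found {current}")
    path.write_bytes(new_content)
    return {"ok": True, "new_hash": content_hash(new_content)}


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "main.py"
    target.write_bytes(b"def foo(): pass\n")
    base = content_hash(target.read_bytes())

    first = apply_patch(target, b"def bar(): pass\n", base)

    # A second apply against the old hash is rejected.
    try:
        apply_patch(target, b"def baz(): pass\n", base)
        stale_rejected = False
    except StaleBaseError:
        stale_rejected = True
    final_content = target.read_bytes()
```

Note that the check and the write here are two separate steps, so another writer could still slip in between them; a production version would need a lock or an atomic rename to close that window.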

Intent Routing

Natural language maps to canonical operator chains:

Intent IR Structure

from dataclasses import dataclass

@dataclass
class IntentIR:
    category: str      # navigate, explain, modify, validate, debug, create
    action: str        # find, understand, rename, test, trace, add
    target: str        # extracted entity/path/concept
    confidence: float  # how certain the parser is about this reading
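
To make the structure concrete, here is a rule-based sketch of turning a request into an IntentIR. The rules, the `parse_intent` name, and the fixed confidence of 1.0 are all assumptions for illustration; the source does not say how plan.intent is implemented, and a real parser would likely be model-based.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class IntentIR:
    category: str
    action: str
    target: str
    confidence: float


# (pattern, category, action): hypothetical rules, tried in order.
RULES = [
    (re.compile(r"^rename (?P<target>\S+) to \S+$", re.I), "modify", "rename"),
    (re.compile(r"^find (?P<target>.+)$", re.I), "navigate", "find"),
    (re.compile(r"^what is (?P<target>.+)$", re.I), "explain", "understand"),
]


def parse_intent(nl: str) -> Optional[IntentIR]:
    """NL -> IntentIR, or None when no rule matches."""
    text = nl.strip()
    for pattern, category, action in RULES:
        match = pattern.match(text)
        if match:
            # 1.0 is a placeholder: a rule either matches or it does not.
            return IntentIR(category, action, match.group("target"), 1.0)
    return None


found = parse_intent("find login handler")
renamed = parse_intent("rename foo to bar")
```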

Canonical Chains

Intent Pattern     Canonical Chain
─────────────────  ──────────────────────────────────────────────────────────────────────────────────────────────
"find X"           fs.search(X)
"what is X"        fs.search(X) → code.summarize
"rename X to Y"    code.references(X) → code.transform(rename) → code.summarize → [policy] → fs.patch → test.test
"fix bug in X"     fs.read(X) → code.parse → code.transform(fix) → [policy] → fs.patch → test.test
"add tests for X"  code.symbols(X) → code.transform(add_tests) → fs.write → test.test
"why does X fail"  test.test(X) → vcs.log → code.summarize
"refactor X"       code.references(X) → code.transform → code.summarize → [policy] → fs.patch → test.test

Policy Gates

High-risk operations require explicit approval:

  • fs.patch (file modifications)
  • fs.write (new file creation)
  • proc.run with shell=True
  • vcs.commit / vcs.push
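
A policy gate can be sketched as a predicate over the tool, action, and parameters of a step, plus an approval callback. The names `needs_approval`, `guarded`, and `approve` are hypothetical; only the list of gated operations comes from the text above.

```python
from typing import Any, Callable

# Operations that are always gated, as listed above.
GATED = {"fs.patch", "fs.write", "vcs.commit", "vcs.push"}


def needs_approval(tool: str, action: str, params: dict[str, Any]) -> bool:
    """True if this step must pass a policy gate before running."""
    op = f"{tool}.{action}"
    if op == "proc.run":
        # proc.run is gated only when a shell is requested.
        return bool(params.get("shell", False))
    return op in GATED


def guarded(tool: str, action: str, params: dict[str, Any],
            approve: Callable[[str], bool]) -> bool:
    """Return True if the step may proceed."""
    if not needs_approval(tool, action, params):
        return True
    return approve(f"{tool}.{action}")


def deny_all(op: str) -> bool:
    return False
```

Passing the approval step in as a callback keeps the gate itself free of any UI or transport concerns.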

Validation Loops

Three distinct validation operations (Vim-inspired):

CHECK (:make equivalent)

  • Fast, incremental feedback
  • Lint, typecheck, format check
  • Tools: ruff, mypy, eslint, tsc, clippy
  • Returns: {diagnostics: [], pass: bool}

BUILD (:!make equivalent)

  • Whole-project compilation
  • Dependency resolution
  • Tools: pip, npm, cargo, go build, make
  • Returns: {success: bool, artifacts: [], errors: []}

TEST (:!make test equivalent)

  • Runtime behavior validation
  • Isolated execution environment
  • Tools: pytest, jest, go test, cargo test
  • Returns: {passed: int, failed: int, summary: str}

Auto-Detection

# test.detect() returns:
{
    "test_runner": {"name": "pytest", "cmd": ["pytest", "-v"]},
    "build_tool": {"name": "pip", "cmd": ["pip", "install", "-e", "."]},
    "check_tool": {"name": "ruff", "cmd": ["ruff", "check", "."]}
}
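
Detection of this kind is often driven by marker files in the project root. The sketch below assumes that approach; the marker-to-tool mapping is illustrative only, since the source does not specify the detection rules.

```python
import tempfile
from pathlib import Path

# Marker file -> tool set. Illustrative mapping, not the real rule set.
MARKERS = [
    ("pyproject.toml", {
        "test_runner": {"name": "pytest", "cmd": ["pytest", "-v"]},
        "build_tool": {"name": "pip", "cmd": ["pip", "install", "-e", "."]},
        "check_tool": {"name": "ruff", "cmd": ["ruff", "check", "."]},
    }),
    ("Cargo.toml", {
        "test_runner": {"name": "cargo test", "cmd": ["cargo", "test"]},
        "build_tool": {"name": "cargo", "cmd": ["cargo", "build"]},
        "check_tool": {"name": "clippy", "cmd": ["cargo", "clippy"]},
    }),
]


def detect(root: Path) -> dict:
    """Path -> {test_runner, build_tool, check_tool}; None when unknown."""
    for marker, tools in MARKERS:
        if (root / marker).is_file():
            return tools
    return {"test_runner": None, "build_tool": None, "check_tool": None}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    nothing = detect(root)
    (root / "pyproject.toml").write_text("[project]\nname = 'demo'\n")
    python_project = detect(root)
```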

Implementation

BaseTool Pattern

import os
from typing import ClassVar

from hanzo_tools.core import BaseTool, ActionHandler, ToolError

class FsTool(BaseTool):
    name: ClassVar[str] = "fs"

    def __init__(self, cwd: str | None = None):
        super().__init__()
        self.cwd = cwd or os.getcwd()
        self._register_actions()

    def _register_actions(self):
        @self.action("read", "Read file content")
        async def read(ctx, path: str, encoding: str = "utf-8"):
            """Path → Content"""
            # Minimal body for illustration; path validation is omitted here.
            with open(os.path.join(self.cwd, path), encoding=encoding) as f:
                content = f.read()
            # content_hash is assumed to be a helper returning e.g. "sha256:..."
            return {"text": content, "hash": content_hash(content)}

        @self.action("search", "Search for pattern")
        async def search(ctx, pattern: str, path: str = ".", **opts):
            """(Pattern, Scope) → [Match]"""
            matches = []  # Placeholder; the real search implementation is elided
            return {"matches": matches, "count": len(matches)}

Unified Response Envelope

All tools return a consistent envelope:

{
    "ok": True,           # Success indicator
    "data": {...},        # Action-specific result
    "error": None,        # Or {code, message, details}
    "meta": {             # Optional metadata
        "duration_ms": 42,
        "effect": "deterministic"
    }
}
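
One way to guarantee that every action returns this shape is to build the envelope in a single wrapper. The sketch below is hypothetical (the `envelope` helper is not from the source) and uses the exception class name as the error code purely for illustration.

```python
import time
from typing import Any, Callable


def envelope(effect: str, fn: Callable[[], Any]) -> dict:
    """Run fn and wrap its result or exception in the response envelope."""
    start = time.perf_counter()
    try:
        data, error = fn(), None
    except Exception as exc:
        data = None
        error = {"code": type(exc).__name__, "message": str(exc), "details": None}
    duration_ms = int((time.perf_counter() - start) * 1000)
    return {
        "ok": error is None,
        "data": data,
        "error": error,
        "meta": {"duration_ms": duration_ms, "effect": effect},
    }


good = envelope("pure", lambda: {"digest": "sha256:abc"})
bad = envelope("deterministic", lambda: 1 / 0)
```

With this arrangement, callers can branch on `ok` alone and never need to catch tool exceptions themselves.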

Entry Points

Each package exports via entry points:

# pyproject.toml
[project.entry-points."hanzo.tools"]
filesystem = "hanzo_tools.filesystem:TOOLS"
code = "hanzo_tools.code:TOOLS"
plan = "hanzo_tools.plan:TOOLS"
test = "hanzo_tools.test:TOOLS"
net = "hanzo_tools.net:TOOLS"

Package Structure

pkg/
├── hanzo-tools-core/          # BaseTool, IdTool, ToolRegistry
│   └── hanzo_tools/core/
├── hanzo-tools-filesystem/    # FsTool (read, write, edit, search, etc.)
│   └── hanzo_tools/filesystem/
├── hanzo-tools-code/          # CodeTool (parse, transform, summarize)
│   └── hanzo_tools/code/
├── hanzo-tools-shell/         # ProcTool (run, bg, signal, wait)
│   └── hanzo_tools/shell/
├── hanzo-tools-vcs/           # VcsTool (status, diff, commit, log)
│   └── hanzo_tools/vcs/
├── hanzo-tools-test/          # TestTool (check, build, test)
│   └── hanzo_tools/test/
├── hanzo-tools-net/           # NetTool (search, fetch, download, crawl)
│   └── hanzo_tools/net/
├── hanzo-tools-plan/          # PlanTool (intent, route, compose)
│   └── hanzo_tools/plan/
├── hanzo-tools-browser/       # BrowserTool (Playwright control)
│   └── hanzo_tools/browser/
├── hanzo-tools-computer/      # ComputerTool (OS desktop control)
│   └── hanzo_tools/computer/
└── hanzo-mcp/                 # MCP server (discovers tools via entry points)
    └── hanzo_mcp/

Migration Path

Phase 1: Implement Core Operators

  • [x] fs - Filesystem operations (existing hanzo-tools-filesystem)
  • [x] id - Identity operations (hash, uri, ref, verify)
  • [x] code - Code operations (parse, transform, summarize)
  • [x] proc - Process execution (existing hanzo-tools-shell)
  • [x] vcs - Version control (existing hanzo-tools-vcs)
  • [x] test - Validation loops (check, build, test)
  • [x] net - Network operations (search, fetch, download, crawl)
  • [x] plan - Orchestration (intent, route, compose)

Phase 2: Control Surfaces

  • [x] browser - Web DOM control (existing hanzo-tools-browser)
  • [ ] computer - OS desktop control (needs update)

Phase 3: Extended Operators

  • [ ] lsp - Language server integration
  • [ ] dbg - Debugger control (breakpoint, step, eval)
  • [ ] repl - Interactive evaluation (send, recv, reset)

Phase 4: Deprecation

  • Deprecate individual tools (read_file, search_code, etc.)
  • Map old names to new unified tools
  • Remove after 6 months
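
The name mapping in Phase 4 could be as simple as an alias table that emits a deprecation warning. Only the old names `read_file` and `search_code` appear in the text; the targets they map to here, and the `resolve` helper, are assumptions.

```python
import warnings
from typing import Optional

# Old tool name -> (tool, action). Targets are assumed for illustration.
ALIASES = {
    "read_file": ("fs", "read"),
    "search_code": ("fs", "search"),
}


def resolve(name: str, action: Optional[str] = None) -> tuple:
    """Map a deprecated tool name to its unified (tool, action) pair."""
    if name in ALIASES:
        tool, mapped = ALIASES[name]
        warnings.warn(
            f"{name} is deprecated; use {tool} with action={mapped!r}",
            DeprecationWarning,
            stacklevel=2,
        )
        return tool, mapped
    return name, action


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = resolve("read_file")
    new_style = resolve("fs", "stat")
```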

Composition Examples

Safe Refactoring

# 1. Find all references (PURE)
refs = await code.call(ctx, action="references", path="src/auth.py", position={"line": 42, "col": 10})

# 2. Generate patch (PURE)
patch = await code.call(ctx, action="transform", path="src/auth.py", kind="rename", old_name="authenticate", new_name="verify_user")

# 3. Summarize changes (PURE)
summary = await code.call(ctx, action="summarize", diff=patch["patch"])

# 4. [POLICY GATE] - User approves

# 5. Apply patch (EFFECT)
await fs.call(ctx, action="patch", path="src/auth.py", patch=patch["patch"], base_hash=patch["base_hash"])

# 6. Verify (EFFECT)
await test.call(ctx, action="run")

Site Mirroring

# Crawl site recursively
result = await net.call(ctx, action="crawl", url="https://docs.example.com", dest="./mirror", depth=3, limit=100)

# Returns: {pages: ["./mirror/index.html", ...], count: 47}

Intent Routing

# Parse intent
intent = await plan.call(ctx, action="intent", nl="find where user authentication happens")

# Route to canonical chain (named `routed` so the `plan` tool is not shadowed)
routed = await plan.call(ctx, action="route", intent_ir=intent)

# Execute chain
for step in routed["nodes"]:
    if step.get("policy_gate"):
        # Request approval before running this step
        pass
    result = await dispatch(step["tool"], step["action"], step["params"])

Security Considerations

  1. Path Traversal: All path operations validate against allowed directories
  2. Command Injection: proc.run sanitizes inputs and uses a shell only when shell=True is passed explicitly
  3. Network Access: net.* operations respect proxy settings and rate limits
  4. File Modifications: Require base_hash precondition to prevent race conditions
  5. Policy Gates: High-risk operations require explicit approval
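
The path traversal check in item 1 can be sketched with `pathlib`. This assumes a single allowed root directory and Python 3.9 or later for `Path.is_relative_to`; `resolve_inside` is a hypothetical helper name.

```python
from pathlib import Path


def resolve_inside(root: Path, user_path: str) -> Path:
    """Resolve user_path under root, rejecting anything that escapes it."""
    root = root.resolve()
    # resolve() collapses ".." segments and follows symlinks, so the
    # containment check runs on the real target path.
    candidate = (root / user_path).resolve()
    if not candidate.is_relative_to(root):
        raise PermissionError(f"{user_path!r} escapes {root}")
    return candidate
```

Absolute inputs such as `/etc/passwd` are rejected too, because joining an absolute path onto `root` discards `root` and the result fails the containment check.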

Changelog

  • 2025-01-22: Initial draft with operator lattice specification
  • 2025-01-21: Created CodeTool, PlanTool, TestTool, NetTool, IdTool packages