claude-tool-audit

Tools reviewed against the four gates — Obs · Cost · Simp · Corr · letter scale A+..F

Plugin repo ▸ github.com/mbrian23/claude-tool-audit  •  Talk ▸ meetup-27-may  •  Built with scripts/build-site.py

Scale ▸ A+ exemplary · A good · B/C acceptable / mediocre · D poor · F failure ▸ drop
Rule ▸ one failing gate = drop. No averaging.

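
The drop rule can be read as a predicate over the four gate grades. The sketch below is one interpretation, assuming that "failing" means a grade of F, as the scale line suggests. The names `GATES`, `failing_gates`, and `must_drop` are illustrative and not taken from the plugin.

```python
# Illustrative sketch only; these names are hypothetical, not the plugin's code.
GATES = ("Obs", "Cost", "Simp", "Corr")
SCALE = ("A+", "A", "B", "C", "D", "F")  # best to worst
FAILING = {"F"}  # assumption: only F counts as a failing gate


def failing_gates(grades: dict) -> list:
    """Return the gates whose grade is failing, in gate order."""
    for gate in GATES:
        if grades.get(gate) not in SCALE:
            raise ValueError(f"missing or unknown grade for {gate!r}")
    return [gate for gate in GATES if grades[gate] in FAILING]


def must_drop(grades: dict) -> bool:
    """One failing gate is enough; grades are never averaged."""
    return bool(failing_gates(grades))


# Grades copied from two rows of the scorecard.
claude_flow = {"Obs": "D", "Cost": "F", "Simp": "D", "Corr": "D"}
ccusage = {"Obs": "A+", "Cost": "A+", "Simp": "A+", "Corr": "A+"}

print(must_drop(claude_flow), failing_gates(claude_flow))  # True ['Cost']
print(must_drop(ccusage))  # False
```

The predicate covers only the drop rule. The Verdict column (Buy, Wrap, Vendor, Defer, Reject) reflects further judgment that this sketch does not attempt to derive.
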
First-party primitives

| Tool | Type | Obs | Cost | Simp | Corr | Verdict | Use case & source |
|---|---|---|---|---|---|---|---|
| Permissions ▸ allow / ask / deny | First-party primitive (config layer, before any plugin) | A+ | A+ | A+ | A+ | Buy | for: Every project. The lightest fix in the decision framework: promoting a repeated "yes" to `allow`, demoting a regret to `deny`, leaving everything else at `ask`. ▸ install ▸ N/A; built into Claude Code, configured in .claude/settings.json ▸ source ▸ https://code.claude.com/docs/en/permissions |
| CLAUDE.md ▸ tight | First-party primitive (always-on context file) | A+ | A | A+ | A | Buy | for: Any project that has more than one contributor. Project conventions and always-on facts: under 200 lines, only conventions and links to longer docs, no imperative rules that need enforcement. ▸ install ▸ N/A; a markdown file you author, no install step ▸ source ▸ https://code.claude.com/docs/en/memory |
| CLAUDE.md ▸ bloated | First-party primitive (always-on context file) | C | D | C | D | | for: Any project that has more than one contributor. Trying to use `CLAUDE.md` for *enforcement*: listing imperative rules ("never run X", "always do Y") rather than just context. ▸ install ▸ N/A; a markdown file you author, no install step ▸ source ▸ Same primitive as `claude-md-tight.md`. Same docs, same file. |
| Hooks ▸ surgical (lint / secret-scan) | First-party primitive (PreToolUse / PostToolUse / Stop) | A+ | A+ | A | A+ | Buy | for: Any project that has a rule worth enforcing deterministically. Surgical hook: one hook, one concern, fast. E.g. lint on file write, secret-scan on bash, boundary-check on edit. ▸ install ▸ N/A; a hooks.json and shell scripts in your own repo ▸ source ▸ https://code.claude.com/docs/en/hooks |
| Hooks ▸ chained external (the wrong use) | First-party primitive (PreToolUse / PostToolUse / Stop) | C | D | D | C | | for: Any project that's reached for hooks for the wrong job. "Smart" hook stack: 5 chained, external API calls, LLM in the decision path. Over-engineered, stalls every action. ▸ install ▸ N/A; hooks.json + shell scripts, supply chain risk lives in *what your scripts call out to* ▸ source ▸ Same primitive as `hooks-lint.md`. Same docs. |
| Skills ▸ first-party primitive | First-party primitive | A+ | A | A | A | Buy | for: Project with reusable domain knowledge that should load on demand. Reusable playbook / domain knowledge, for when none of Permissions, Hooks, MCP, or Sub-agents fits the need. ▸ install ▸ N/A; markdown files, nothing executes at install ▸ source ▸ https://code.claude.com/docs/en |
| Sub-agents ▸ verbose-task isolation | First-party primitive | A | A | B | A+ | Buy | for: Project with verbose, parallelizable work that would otherwise pollute the main context. Isolating verbose research / multi-file refactoring / exploration into a sub-agent so the main context stays clean; *not* the PR-review use case (see `pr-review-toolkit.md` for that scoring). ▸ install ▸ N/A; markdown files defining the sub-agent ▸ source ▸ https://code.claude.com/docs/en |
| Auto memory | First-party primitive | C | C | A | C | Wrap | for: Long-running projects with cross-session context. Persistent cross-conversation context: user profile, feedback, project history. ▸ install ▸ N/A; first-party, files written to ~/.claude/projects/.../memory/ ▸ source ▸ Claude Code persistent file-based memory at `~/.claude/projects/.../memory/` |

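
For the Permissions row, the three buckets correspond to a `permissions` object in `.claude/settings.json`. The sketch below builds such an object in Python and prints it as JSON. The rule strings are made-up examples chosen for illustration and the helper `build_permissions` is hypothetical; the permissions documentation linked in the table is the reference for exact rule syntax.

```python
import json


def build_permissions(allow, ask, deny):
    """Assemble a settings fragment with the three permission buckets."""
    overlap = set(allow) & set(deny)
    if overlap:
        # A rule in both buckets signals an unresolved decision.
        raise ValueError(f"rules in both allow and deny: {sorted(overlap)}")
    return {
        "permissions": {
            "allow": list(allow),
            "ask": list(ask),
            "deny": list(deny),
        }
    }


settings = build_permissions(
    allow=["Bash(npm run lint)"],  # a repeated "yes", promoted
    ask=["Bash(git push:*)"],      # still worth a prompt each time
    deny=["Read(./.env)"],         # a regret, demoted
)

# Review the output, then place it in .claude/settings.json by hand.
print(json.dumps(settings, indent=2))
```

Printing rather than writing the file keeps the sketch side-effect free; an existing settings file may hold other keys that should not be overwritten.
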
Slash commands

| Tool | Type | Obs | Cost | Simp | Corr | Verdict | Use case & source |
|---|---|---|---|---|---|---|---|
| /clear | First-party slash command | A+ | A+ | A+ | A+ | Buy | for: Every Claude Code session. Resetting context when the conversation drifts off-task or when context budget is about to bite; cheaper than scrolling, cheaper than restarting. ▸ install ▸ N/A; first-party CLI command, nothing to install ▸ source ▸ https://code.claude.com/docs/en |
| /memory | First-party slash command | A+ | A+ | A+ | A | Buy | for: Long-running projects with persistent memory. Weekly audit of accumulated memories: trim stale ones, verify project context is still accurate, remove obsolete user preferences. ▸ install ▸ N/A; first-party CLI command, nothing to install ▸ source ▸ https://code.claude.com/docs/en |
| /loop ▸ recurring prompt | First-party slash command | A+ | C | A | A | Buy | for: Project where polling or recurring work is genuinely useful. Polling a CI run that takes ~8 minutes: short loop, clear stop condition (the run completes), bounded total cost. ▸ install ▸ N/A; first-party CLI command, nothing to install ▸ source ▸ https://code.claude.com/docs/en |

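
The shape the /loop row recommends (short loop, clear stop condition, bounded total cost) can be sketched as a plain polling helper. This is not how /loop is implemented; `poll_until` is a hypothetical function, and the 120-second interval and six-poll budget are assumed values picked to cover a run of roughly eight minutes.

```python
import time


def poll_until(check, interval_s, max_polls, sleep=time.sleep):
    """Call check() until it returns True or the poll budget runs out.

    Returns (done, polls_used). max_polls caps the total cost up front.
    """
    for polls_used in range(1, max_polls + 1):
        if check():
            return True, polls_used
        if polls_used < max_polls:
            sleep(interval_s)  # wait only if another poll will follow
    return False, max_polls


# Stand-in for a real CI status query: reports completion on the 4th check.
responses = iter([False, False, False, True])
done, polls = poll_until(
    lambda: next(responses),
    interval_s=120,
    max_polls=6,
    sleep=lambda seconds: None,  # no real waiting in the demo
)
print(done, polls)  # True 4
```

Fixing `max_polls` before the loop starts is what makes the cost bounded: the worst case is known in advance even if the stop condition never fires.
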
Models

| Tool | Type | Obs | Cost | Simp | Corr | Verdict | Use case & source |
|---|---|---|---|---|---|---|---|
| Sonnet (Claude Sonnet 4.6) | Model selection | A | A | A+ | A | Buy | for: Any project; Sonnet vs Opus is the decision. Default for high-volume / context-heavy work: Sonnet is plenty for most coding tasks, Opus is overkill when the task isn't deep-reasoning. ▸ install ▸ N/A; model selection, no install step ▸ source ▸ Anthropic (first-party model) |
| Haiku (Claude Haiku 4.5) | Model selection | A | A+ | A+ | C | Buy | for: Sub-agent worker pools, cheap audit hooks. Sub-agent workers and cheap-model audit hooks: Haiku is the floor; use it when "good enough fast" beats "deep and slow." ▸ install ▸ N/A; model selection, no install step ▸ source ▸ Anthropic (first-party model) |

MCP servers

| Tool | Type | Obs | Cost | Simp | Corr | Verdict | Use case & source |
|---|---|---|---|---|---|---|---|
| Playwright MCP | MCP server (stdio, runs locally) | A | A | A | A+ | Wrap | for: Internal product, multi-month horizon, business data. UI verification on internal product: real browser checks during development, no live customer data. ▸ install ▸ npm install + npx ▸ Playwright binary install runs setup scripts (Microsoft-signed, pinnable by version, readable). Not a smell; the runtime surface is the issue. ▸ source ▸ https://github.com/microsoft/playwright-mcp |
| Context7 MCP | MCP server (hosted, HTTP) | D | C | A+ | A | Vendor | for: Internal product, multi-month horizon, business data. Live library docs for fast-moving APIs during development: better than training-data memory; vendored copy planned for prod. ▸ install ▸ Hosted, OAuth click-through ▸ no local code execution at adoption. Obs unaffected by install path; runtime opacity is the issue. ▸ source ▸ https://github.com/upstash/context7 |
| Slack MCP | MCP server (hosted, OAuth bearer) | C | A | A | B | Wrap | for: Internal product, multi-month horizon, business data. Ops/on-call automation: reading channel state and posting status updates, scoped to one bot user. ▸ install ▸ Hosted, OAuth click-through ▸ no local code. Runtime auth-scope smell does the damage, not install. ▸ source ▸ Anthropic-hosted MCP server, OAuth-based |
| Vercel MCP | MCP server (hosted, OAuth bearer) | C | A | A | B | Wrap | for: Internal product on Vercel, multi-month horizon, business data. Deploy automation in a Vercel-hosted project: log queries, deployment promotion, env var management. ▸ install ▸ Hosted, OAuth click-through ▸ same shape as Slack. ▸ source ▸ Vercel-hosted MCP server, OAuth-based |
| Gmail / Drive MCP (claude.ai-hosted) | MCP server (hosted, OAuth bearer) | D | A | A+ | A | Buy | for: Personal use only; **forbidden for client / business data**. Personal email and document automation on a personal account, not business / client data. ▸ install ▸ Hosted, claude.ai built-in ▸ no local install; broad OAuth scope is the runtime knock. ▸ source ▸ Anthropic claude.ai built-in MCP |
| Mermaid MCP | MCP server (local stdio) | A+ | A+ | A+ | A+ | Buy | for: Any project that needs diagrams in the IDE. Rendering Mermaid diagrams for documentation and presentations without leaving Claude Code. ▸ install ▸ Local install ▸ stdlib renderer; no postinstall surprises if installed from the canonical repo. Pinnable. ▸ source ▸ Local Mermaid renderer exposed as an MCP server |
| Generic vendor-API MCP | MCP server (hosted) | D | C | A | C | | for: Internal product moving toward production. Quick integration with a vendor service during development, to be replaced by an in-house wrapper before any production rollout. ▸ install ▸ Varies ▸ if hosted: OAuth click-through. If local: typically npm install; always read the postinstall script before adopting. Often the actual supply-chain risk. ▸ source ▸ Pattern: any hosted MCP server fronting a vendor's REST/GraphQL API |

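
The advice in the Generic vendor-API row to read the postinstall script before adopting can be partly automated. The sketch below lists the install-time lifecycle scripts declared in a package.json string. The function name `install_time_scripts` and the sample manifest are hypothetical, and the check covers only the top-level package, not its dependencies.

```python
import json

# npm lifecycle scripts that run when a package is installed.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_time_scripts(package_json_text):
    """Return the install-time scripts declared in a package.json string."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts") or {}
    return {name: scripts[name] for name in INSTALL_HOOKS if name in scripts}


# Made-up manifest for a hypothetical local MCP server package.
sample = json.dumps({
    "name": "example-vendor-mcp",
    "version": "0.1.0",
    "scripts": {"build": "tsc", "postinstall": "node scripts/setup.js"},
})

for name, command in install_time_scripts(sample).items():
    print(f"{name}: {command}")  # postinstall: node scripts/setup.js
```

An empty result does not clear a package on its own, since transitive dependencies can declare their own install scripts; it only narrows what needs to be read by hand.
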
Plugins & frameworks

| Tool | Type | Obs | Cost | Simp | Corr | Verdict | Use case & source |
|---|---|---|---|---|---|---|---|
| pr-review-toolkit | Plugin (6 specialized review sub-agents) | A | C | C | A+ | Wrap | for: Any project shipping PRs with non-trivial diffs. Pre-merge review for non-trivial PRs where catching silent failures and type errors before they ship justifies the per-review bill. ▸ install ▸ /plugin install ▸ Anthropic-published plugin; no postinstall code. Markdown agents only. ▸ source ▸ Anthropic-published plugin (in this marketplace) |
| obra/superpowers | Plugin (sub-agent methodology: TDD + brainstorming + 2-stage review) | A | D | D | A+ | Wrap | for: Multi-month product where correctness justifies the bill. Production code where correctness justifies the bill; adoption of the methodology, not casual sampling. ▸ install ▸ /plugin install ▸ markdown agents + commands. Readable. No code at install. ▸ source ▸ https://github.com/obra/superpowers |
| SuperClaude_Framework | Plugin (cognitive personas + structured workflows) | A | D | D | C | Defer | for: Team committing to a unified house style. Teams adopting a unified cognitive style across all Claude Code sessions; house style is the deliverable. ▸ install ▸ /plugin install + framework config ▸ heavier setup; review the persona files before adopting. ▸ source ▸ https://github.com/SuperClaude-Org/SuperClaude_Framework |
| wshobson/agents | Marketplace (191 specialized sub-agents across 78 plugins) | A | C | C | A | Wrap | for: Any project with recurring specialist work. Cherry-picking 2–3 sub-agents that match recurring work in *this* project, **not** installing the whole marketplace. ▸ install ▸ /plugin install (per sub-agent) ▸ markdown only; the risk is the 191-surface, not install code. ▸ source ▸ https://github.com/wshobson/agents |
| ruvnet/claude-flow | Multi-agent swarm orchestration (314 MCP tools) | D | F | D | D | Reject | for: Anything beyond a research demo. Multi-agent swarm orchestration for an exploratory research demo. ▸ install ▸ npm install + MCP server bootstrap ▸ 314-tool server runs at session start. Cannot fully audit at install time. −2 Obs. ▸ source ▸ https://github.com/ruvnet/claude-flow |

Around Claude Code

| Tool | Type | Obs | Cost | Simp | Corr | Verdict | Use case & source |
|---|---|---|---|---|---|---|---|
| ccusage | Local CLI / cost analyzer | A+ | A+ | A+ | A+ | Buy | for: Every Claude Code project. Local cost analyzer: makes the Cost gate enforceable by surfacing the actual token spend per session. ▸ install ▸ npm install -g ccusage ▸ single binary; no postinstall code at version-pinned releases. Readable. Pinnable. ▸ source ▸ https://github.com/ryoppippi/ccusage |
| claudia | Desktop GUI / session manager (Tauri) | A | A+ | A | A | Buy | for: Teams that prefer a dashboard alongside the CLI. Visual dashboard for a team that wants to see all running sessions and agent state at a glance, while keeping the CLI as the primary interface. ▸ install ▸ Tauri desktop installer ▸ signed app bundle, standard macOS/Linux install flow. Readable source. ▸ source ▸ https://github.com/getAsterisk/claudia |
| claude-code-router | CLI wrapper (routes requests to DeepSeek / Gemini / Groq / etc.) | D | A+ | C | D | Reject | for: Anything beyond a personal hobby account. Routing Claude Code requests to alternative LLM vendors to cut cost on a personal/hobby account. ▸ install ▸ npm install + config ▸ wrapper code itself is small and readable; the routing destinations are the runtime risk, not install. ▸ source ▸ https://github.com/musistudio/claude-code-router |

This plugin (dogfood)

| Tool | Type | Obs | Cost | Simp | Corr | Verdict | Use case & source |
|---|---|---|---|---|---|---|---|
| claude-tool-audit ▸ itself | Plugin (4 skills, 3 slash-command wrappers, 4 references, 2 scripts, 28+ examples) | A+ | A | A | A | Buy | for: Any Claude Code project considering installing this plugin. Adopting this plugin to score other tools in a Claude Code project: installing it and running the three skills. ▸ install ▸ git clone + /plugin install ▸ nothing executes at install. Markdown skills + stdlib Python scripts. Fully readable. ▸ source ▸ https://github.com/mbrian23/claude-tool-audit |