Capabilities
A complete picture of what makes claw-forge the right harness for serious autonomous coding pipelines.
| Feature | Status | Details |
|---|---|---|
| Language & Installation | | |
| Language | ✅ | Pure Python: no Node.js, no Bun, no npm |
| Package manager | ✅ | `uv tool install claw-forge`: isolated, instant, no venv ceremony |
| PyPI package | ✅ | Also installable via `pip install claw-forge` |
| AI Provider Support | | |
| API rotation pool | ✅ | Round-robin and weighted routing across 7+ providers with automatic failover |
| Anthropic (direct) | ✅ | Direct API key or OAuth token from `claude login` |
| AWS Bedrock | ✅ | IAM credentials or instance role; no key management |
| Azure AI Foundry | ✅ | Azure OpenAI-compatible endpoint with managed identity support |
| Google Vertex AI | ✅ | ADC credentials, automatic model name conversion (`@` format) |
| Groq (free tier) | ✅ | 14,400 req/day free; ideal for monitoring and lightweight tasks |
| Cerebras (free tier) | ✅ | 1M tokens/day free; Llama 3.3 70B at near-instant inference |
| Ollama (local) | ✅ | Any locally running model via Ollama's OpenAI-compatible API |
| Anthropic-compat proxies | ✅ | Custom `base_url` with `x-api-key` auth; works with any Anthropic-compatible proxy |
| Circuit breaker per provider | ✅ | Closed → Half-open → Open state machine; auto-recovery after cooldown |
| Provider health dashboard | ✅ | Live health dots in Kanban UI: RPM, latency, cost per provider |
| Per-provider cost tracking | ✅ | USD cost tracked per session and per provider |
| OAuth token support | ✅ | Auto-reads `~/.claude/.credentials.json`; re-reads on 401 |
| Claude Agent SDK Integration | | |
| Bidirectional sessions | ✅ | `ClaudeSDKClient`: mid-session follow-ups, model switching, interrupt |
| In-process MCP server | ✅ | Feature DB tools run in-process; zero subprocess cold-start overhead |
| File checkpointing + rewind | ✅ | Rewind all files to any prior checkpoint without git |
| Pre-compact hook | ✅ | Custom compaction instructions preserve feature state across context limits |
| Structured JSON output | ✅ | Schema-enforced output from reviewer and planning agents |
| Thinking config | ✅ | Deep thinking for planning, adaptive for coding, disabled for monitoring |
| Named sub-agents | ✅ | `AgentDefinition`: planner/coder/reviewer with separate prompts and tool sets |
| Cost cap per session | ✅ | `max_budget_usd`: hard stop when budget is hit |
| Token-level streaming | ✅ | `StreamEvent` for typewriter effect in terminal UI |
| Architecture & Concurrency | | |
| Concurrency model | ✅ | Pure `asyncio.TaskGroup`; no subprocess+threading mix |
| State management | ✅ | FastAPI REST + SQLAlchemy + WebSocket; clean separation of concerns |
| Session hydration | ✅ | `session_manifest.json` survives restarts and process crashes |
| Plugin system | ✅ | `pyproject.toml` entry points; third-party plugins without forking core |
| Dependency-aware scheduling | ✅ | Kahn's algorithm + DFS cycle detection; features run in dependency order |
| Security | | |
| Bash security hook | ✅ | Hierarchical allowlist: hardcoded blocklist → global defaults → project-specific |
| CanUseTool callback | ✅ | Programmatic permission control with input mutation before execution |
| OS-level sandbox | ✅ | `SandboxSettings`: filesystem + network isolation at OS level (macOS/Linux) |
| Write restriction to project dir | ✅ | File writes sandboxed to the project directory automatically |
| Agent lock file | ✅ | `.claw-forge.lock` prevents duplicate agents on the same project |
| Skills & Built-in Tooling | | |
| LSP skills (Python) | ✅ | Pyright: type checking, autocomplete, go-to-definition |
| LSP skills (Go) | ✅ | gopls: full Go language intelligence |
| LSP skills (Rust) | ✅ | rust-analyzer: borrow checker integration, refactoring |
| LSP skills (TypeScript) | ✅ | ts-server: JS/TS type checking and navigation |
| LSP skills (Solidity) | ✅ | Solidity LSP: smart contract analysis |
| LSP skills (C/C++) | ✅ | clangd: C/C++ intelligence and formatting |
| Systematic debug skill | ✅ | Structured root-cause analysis workflow |
| Verification gate skill | ✅ | Runs checks before a task is reported complete |
| Parallel dispatch skill | ✅ | Routes subtasks to parallel agents automatically |
| Frontend design skill | ✅ | Production-grade UI design guidance for web agents |
| Playwright browser skill | ✅ | Browser automation for web testing agents |
| Workflow Features | | |
| YOLO mode | ✅ | `--yolo`: max concurrency, auto-approve permissions, skip verification |
| Pause / resume | ✅ | Drain mode: finish active features, then pause gracefully |
| Human input requests | ✅ | Agent raises `needs_human` flag; `claw-forge input` CLI unblocks it |
| Batch feature mode | ✅ | Implement multiple features per session with `--batch-size` |
| Slash commands (`.claude/`) | ✅ | `create-spec`, `expand-project`, `check-code`, `checkpoint`, `review-pr`, `pool-status` |
| Session resume | ✅ | Continue or fork any prior session by ID |
| Rate limit handling | ✅ | Parse `Retry-After` headers, exponential backoff, auto-resume after cooldown |
| UI & Monitoring | | |
| Kanban board | ✅ | 5-column board: Pending / In Progress / Passing / Failed / Blocked |
| Provider health dots | ✅ | Green/amber/red per provider; click for RPM, latency, circuit state |
| Real-time WebSocket updates | ✅ | Feature status, agent events, and cost update live, with no polling |
| Documentation & Quality | | |
| Tutorial website | ✅ | This site: quickstart, provider setup, plugin guide, skills reference |
| SDK API guide | ✅ | 20 Claude Agent SDK APIs documented with claw-forge examples (`docs/sdk-api-guide.md`) |
| Test coverage ≥ 90% | ✅ | Enforced in CI; 427 tests, all passing |
| Type annotations (strict) | ✅ | Full mypy strict; no `Any` escapes in core modules |
| GitHub CI/CD | ✅ | Lint + typecheck + full test suite on every push and PR |
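
The dependency-aware scheduling row names Kahn's algorithm and DFS cycle detection. As a rough illustration of how those two pieces fit together, here is a minimal sketch. It is not claw-forge's actual scheduler: the function names (`normalize`, `find_cycle`, `schedule_waves`) and the idea of grouping features into parallel "waves" are assumptions made for this example.

```python
from __future__ import annotations


def normalize(deps: dict[str, set[str]]) -> dict[str, set[str]]:
    """Copy the graph and add an empty entry for every feature only seen as a dependency."""
    graph = {name: set(d) for name, d in deps.items()}
    for d in list(graph.values()):
        for dep in d:
            graph.setdefault(dep, set())
    return graph


def find_cycle(graph: dict[str, set[str]]) -> list[str] | None:
    """Depth-first search with three colours; returns one cycle as a path, or None."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {name: WHITE for name in graph}
    stack: list[str] = []

    def visit(node: str) -> list[str] | None:
        color[node] = GREY
        stack.append(node)
        for dep in sorted(graph[node]):
            if color[dep] == GREY:  # back edge: dep is already on the current path
                return stack[stack.index(dep):] + [dep]
            if color[dep] == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for name in sorted(graph):
        if color[name] == WHITE:
            cycle = visit(name)
            if cycle:
                return cycle
    return None


def schedule_waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Kahn's algorithm, emitting groups of features whose dependencies are all done."""
    graph = normalize(deps)
    cycle = find_cycle(graph)
    if cycle:
        raise ValueError("dependency cycle: " + " -> ".join(cycle))

    remaining = {name: len(d) for name, d in graph.items()}  # unmet dependency counts
    dependents: dict[str, list[str]] = {name: [] for name in graph}
    for name, d in graph.items():
        for dep in d:
            dependents[dep].append(name)

    wave = sorted(n for n, count in remaining.items() if count == 0)
    waves: list[list[str]] = []
    while wave:
        waves.append(wave)
        unlocked = []
        for name in wave:
            for child in dependents[name]:
                remaining[child] -= 1
                if remaining[child] == 0:
                    unlocked.append(child)
        wave = sorted(unlocked)
    return waves
```

With the hypothetical input `{"api": {"db"}, "ui": {"api"}, "auth": {"db"}, "db": set()}`, `schedule_waves` places `db` first, then `api` and `auth` together, then `ui`. Running the DFS check before Kahn's algorithm lets the error message name the offending cycle rather than only reporting that some features could not be ordered.
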
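
The per-provider circuit breaker row describes a three-state machine with auto-recovery after a cooldown. The sketch below shows the conventional form of that pattern: consecutive failures open the circuit, the cooldown moves it to half-open, and a successful probe closes it again. The class name, thresholds, and injectable clock are assumptions for illustration and are not taken from claw-forge's code.

```python
from __future__ import annotations

import time
from enum import Enum
from typing import Callable


class State(Enum):
    CLOSED = "closed"        # normal operation, requests flow
    OPEN = "open"            # provider is skipped until the cooldown elapses
    HALF_OPEN = "half-open"  # cooldown elapsed, probe requests are allowed


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        cooldown_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self._clock = clock
        self._failures = 0
        self._opened_at = 0.0
        self.state = State.CLOSED

    def allow_request(self) -> bool:
        """Return True if a request may be sent to this provider right now."""
        if self.state is State.OPEN:
            if self._clock() - self._opened_at >= self.cooldown_s:
                self.state = State.HALF_OPEN
                return True
            return False
        return True

    def record_success(self) -> None:
        """Any success resets the failure count and closes the circuit."""
        self._failures = 0
        self.state = State.CLOSED

    def record_failure(self) -> None:
        """Open on a failed probe, or once consecutive failures hit the threshold."""
        self._failures += 1
        if self.state is State.HALF_OPEN or self._failures >= self.failure_threshold:
            self.state = State.OPEN
            self._opened_at = self._clock()
```

A rotation pool would typically call `allow_request()` before routing to a provider and fall through to the next provider when it returns `False`. Passing the clock in as a parameter keeps the cooldown testable without sleeping.
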
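
The rate limit handling row combines `Retry-After` parsing with exponential backoff. A minimal sketch of that combination, using only the standard library, might look like the following. The function names, the 60-second cap, and the choice to prefer the server's hint over the computed delay are assumptions for this example, not documented claw-forge behavior.

```python
from __future__ import annotations

import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value: str | None, now: datetime | None = None) -> float | None:
    """Return the seconds to wait from a Retry-After header, or None if unusable.

    The header may carry either a number of seconds or an HTTP date.
    """
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:  # treat dates without an offset as UTC
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def backoff_delay(
    attempt: int,
    retry_after: float | None = None,
    base: float = 1.0,
    cap: float = 60.0,
    jitter: float = 0.0,
) -> float:
    """Pick a wait time: the server's hint if present, else capped exponential backoff."""
    if retry_after is not None:
        return retry_after
    delay = min(cap, base * 2 ** attempt)
    # Optional jitter spreads retries out so parallel agents do not retry in lockstep.
    return delay + random.uniform(0, jitter * delay)
```

With the defaults, attempts 0, 1, 2, and 3 wait 1, 2, 4, and 8 seconds, and later attempts level off at the 60-second cap. A non-zero `jitter` (for example `0.25`) adds up to that fraction of the delay on top.
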
Roadmap
Honest about what's still in progress.
Distribute agent waves across multiple machines. The pool manager already handles routing; the distributed work queue layer is still needed.
Currently SQLite, which is great for local development. Adding PostgreSQL and a hosted cloud option for team use.
Embed the Kanban UI directly in VS Code. See agent progress without leaving your editor.
Historical cost breakdown per project, per provider, per feature, so you can optimize spend over time.