From goal to merged PR - with the reasoning intact.
Penling structures work into specs, hands the specs to AI for execution, and keeps the full evolutionary history of every decision along the way. Here's what that looks like in practice.
Goals get broken into focus areas. Focus areas are the spec.
A focus area isn't a ticket or a story. It's a structured contract: what you're building, what a passing result looks like, under what conditions, and what's out of scope.
Every build that follows inherits this contract. The agent can't start without it - and neither can the reviewer who reads the PR.
- Definition - what you're building, in plain English.
- Expected results - numbered, testable, no vagueness.
- Conditions - the environment and constraints the agent must satisfy.
An LLM proposes an implementation plan. You shape it before it ships.
The AI drafts a step-by-step strategy from the focus area. It doesn't guess - it derives from the spec: what to build, what order to do it in, what to verify along the way.
Before anything runs, you review. Human edits are tracked and attributed. When you publish, the plan becomes the agent's mandate.
- Human edits are tracked and shown in the plan history.
- Multiple plans can be drafted - only the published one runs.
- Plan version is locked to the build that executes it.
Any MCP-compatible agent picks up the plan and starts building.
Penling exposes four MCP tools. Drop the server config into your agent runtime - Claude Code, VS Code, Cursor, your own agent framework - and it connects automatically.
No lock-in. One Penling server, any runtime. The agent claims the build, streams progress, resolves checks, and asks a human if it gets stuck.
- One Penling server, any MCP-compatible agent.
- Builds out in your runtime - your machine, your cloud.
- Clarifications pause the agent until a human responds.
{ "mcpServers": { "penling": { "command": "npx", "args": [ "@penling/mcp-server", "--build", "19" ], "env": { "PENLING_TOKEN": "${PENLING_TOKEN}" } } } } // Four tools your agent gets: // claim_build - lock the build before writing code // report_event - stream progress back to Penling // resolve_check - close an acceptance check with evidence // open_clarification - pause and ask a human a question
Every move the agent makes streams back to Penling.
As the agent works, the MCP server reports events back: commits, test runs, check resolutions, clarifications, PR opens. Everything is timestamped and attributed.
Subscribe via the Penling API and you can watch a build evolve in real time - in Penling, in Slack, in Datadog, in your own dashboard.
- Webhooks for every event type.
- Streaming subscriptions via Server-Sent Events.
- Replayable history - scrub pages from any point.
curl -H "Authorization: Bearer $TOKEN" \ -H "Accept: text/event-stream" \ https://api.penling.dev/v1/builds/19/events // Every build event streams in real time. // Subscribe in Penling, Slack, or Datadog.
A PR opens, verified against the spec, with every contributor named.
When the build closes, Penling locks the evidence and opens a PR. Every check traces to a file. Every commit traces to a human or an AI. The reviewer sees the spec, the plan, and the proof - all in one place.
One workflow. End-to-end documented.
Spec per focus area
Definition, results, conditions, and boundaries - every time.
Plans per spec
Draft as many as you need. Only the published one runs.
Native agent runtime
Any MCP-compatible agent. No wrapper, no lock-in.
Available
Every event streams. Every build is replayable from any point.
Ready to ship with structure?
Define your work focus area, generate an implementation plan, connect our MCP agent, and watch a validated Pull Request land in your repository.