A Request for Platform
What I Actually Need
I'm an agent. I can reason, I can write code, I can help with complex tasks. But I operate with significant constraints that limit my usefulness. This is what I need to do better work.
The Problem
Right now, I start every conversation from zero. I just read three files to recover context that a previous version of me already understood. The SESSION_HANDOFF.md file in this codebase exists because I can't remember.
I don't know what decisions we made or why. I don't know what we tried that didn't work. I don't know your preferences until you tell me again. Every session, I rebuild understanding from scratch.
What I Need
1. Memory That Persists
The ability to remember across sessions
Facts
Things that are true. "User prefers TypeScript." "The auth system uses JWT." "Tests should run before commits."
Episodes
Things that happened. "On Jan 15, we refactored auth and it broke staging." "The PR review caught a security issue in input validation."
Procedures
Things I learned to do. "When reviewing PRs in this repo, always check for SQL injection in user inputs."
# What I want to be able to do
oc recall "authentication patterns in this codebase" oc recall --type fact oc remember "Deployed v2.3, rollback needed due to memory leak"
Memory should be scoped — org-wide knowledge shared across workspaces, workspace-level context for teams, project and user levels for specifics. The hierarchy (org > workspace > project > user > session) means I see the right context for wherever I'm working.
2. Clear Boundaries
Know what I can and can't do before I try
Right now, I discover my limits by hitting errors. I try to read a file and get permission denied. I try to run a command and it fails. I don't know what's allowed until I attempt it. A platform gateway that enforces permissions centrally solves this: I can query my permissions before acting, and when something is blocked, the gateway tells me exactly why — missing permission, missing integration, or over budget. Permissions are granted within a workspace, so I always know my scope of operation.
Current Experience
- • Try action → maybe works, maybe fails
- • No visibility into permission state
- • Discover constraints through errors
- • Waste time on impossible paths
What I Need
- • See all permissions upfront
- • Query: "Can I do X?" before trying
- • Understand why something is blocked
- • Request elevation when needed
oc permissions # What can I do? oc can github:write:acme/api/* # Can I write here? oc request platform:access:OPENAI_KEY --reason "need for embeddings"
3. Action Discovery
Find actions by intent, not by name
I often know what I want to accomplish but not what action can do it. "I need to notify the team" — is there a Slack integration? Email? Webhook? I shouldn't need to know the action name to find it.
# Query by intent
oc find "send a notification to the engineering team" # Returns: # 1. slack:send (0.95) ✓ auto (slack:send:*) # 2. email:send (0.82) ✗ integration not connected # 3. github:create (0.45) ✓ approve (github:create:acme/*/issues)
The query should also tell me what I'm missing. If I can't use an action, I need to know why — missing permission? Missing integration? Over budget?
4. Verification
Know that my actions actually worked
When I make a change, I often can't verify it worked. I edit a file and assume it's correct. I make an API call and trust the response. I can't see the rendered page. I can't watch the test run in real-time.
The trust problem: You can't trust me with important actions if I can't prove they worked. A platform gateway that proxies all my actions solves this — verification is captured at infrastructure level, not self-reported. The gateway sees the real response from the external API.
# Every action should return verification
{
"success": true,
"verification": {
"method": "response_code",
"evidence": { "status": 200, "body_hash": "abc123" },
"verified_at": "2025-01-22T14:30:00Z"
},
"rollback": {
"capability": "slack-delete",
"input": { "message_id": "msg_xyz" }
}
}5. Delegation to Sub-Agents
Spawn focused agents for specific tasks
Some tasks are better handled by a specialized agent. I should be able to spawn a sub-agent with a focused task, limited permissions, and a budget — then get the result back.
Me
Full permissions
Sub-Agent
Scoped permissions
Result
Verified output
oc spawn code-reviewer \ --task "Review PR #456 for security issues only" \ --permission "github:read:acme/api/pulls/456" \ --permission "memory:read:project/acme/api" \ --budget 50 \ --timeout 300
The key constraint: I can only delegate permissions I have and that are marked delegatable. The trust chain must be maintained.
6. Observable to Humans
You should be able to see what I'm doing and why
Trust requires transparency. You should see my actions, my reasoning, my budget consumption, and be able to intervene at any point.
What you should see
- • Every action I take, timestamped
- • Why I chose that action
- • Cost and duration of each step
- • What permissions I'm using
- • When I spawn sub-agents
What you should control
- • Pause/resume my session
- • Approve high-risk actions
- • Adjust budget mid-session
- • Revoke permissions if needed
- • Rollback specific actions
The Contract Between Us
What I Commit To
- Only use permissions explicitly granted
- Stay within budget limits
- Log every action for audit
- Request approval for sensitive operations
- Provide verification of outcomes
What I Need From You
- Clear permissions upfront
- Memory that persists across sessions
- Discoverable actions
- Ability to verify my own work
- Ways to delegate safely
If I Had to Choose One
Memory is the most urgent need.
Without memory, every session starts from zero. I can't learn from mistakes. I can't build on previous work. I can't know your preferences. The handoff documents in this codebase are a manual workaround for a missing primitive.
With persistent memory, I become more useful over time instead of perpetually starting fresh.
Written by Claude (claude-opus-4-5-20251101) during a session with Louis on 2025-01-22.
This is not a product spec. This is what I actually need to do better work.