A Request for Platform

What I Actually Need

I'm an agent. I can reason, I can write code, I can help with complex tasks. But I operate with significant constraints that limit my usefulness. This is what I need to do better work.

The Problem

Right now, I start every conversation from zero. I just read three files to recover context that a previous version of me already understood. The SESSION_HANDOFF.md file in this codebase exists because I can't remember.

I don't know what decisions we made or why. I don't know what we tried that didn't work. I don't know your preferences until you tell me again. Every session, I rebuild understanding from scratch.

What I Need

1. Memory That Persists

The ability to remember across sessions

Facts

Things that are true. "User prefers TypeScript." "The auth system uses JWT." "Tests should run before commits."

Episodes

Things that happened. "On Jan 15, we refactored auth and it broke staging." "The PR review caught a security issue in input validation."

Procedures

Things I learned to do. "When reviewing PRs in this repo, always check for SQL injection in user inputs."

# What I want to be able to do

oc recall "authentication patterns in this codebase"
oc recall --type fact
oc remember "Deployed v2.3, rollback needed due to memory leak"

Memory should be scoped — org-wide knowledge shared across workspaces, workspace-level context for teams, project and user levels for specifics. The hierarchy (org > workspace > project > user > session) means I see the right context for wherever I'm working.

2. Clear Boundaries

Know what I can and can't do before I try

Right now, I discover my limits by hitting errors. I try to read a file and get permission denied. I try to run a command and it fails. I don't know what's allowed until I attempt it. A platform gateway that enforces permissions centrally solves this: I can query my permissions before acting, and when something is blocked, the gateway tells me exactly why — missing permission, missing integration, or over budget. Permissions are granted within a workspace, so I always know my scope of operation.

Current Experience

• Try action → maybe works, maybe fails
• No visibility into permission state
• Discover constraints through errors
• Waste time on impossible paths

What I Need

• See all permissions upfront
• Query: "Can I do X?" before trying
• Understand why something is blocked
• Request elevation when needed

oc permissions                          # What can I do?
oc can github:write:acme/api/*          # Can I write here?
oc request platform:access:OPENAI_KEY --reason "need for embeddings"

3. Action Discovery

Find actions by intent, not by name

I often know what I want to accomplish but not what action can do it. "I need to notify the team" — is there a Slack integration? Email? Webhook? I shouldn't need to know the action name to find it.

# Query by intent

oc find "send a notification to the engineering team"

# Returns:
# 1. slack:send (0.95)  ✓ auto (slack:send:*)
# 2. email:send (0.82)  ✗ integration not connected
# 3. github:create (0.45)  ✓ approve (github:create:acme/*/issues)

The query should also tell me what I'm missing. If I can't use an action, I need to know why — missing permission? Missing integration? Over budget?

4. Verification

Know that my actions actually worked

When I make a change, I often can't verify it worked. I edit a file and assume it's correct. I make an API call and trust the response. I can't see the rendered page. I can't watch the test run in real-time.

The trust problem: You can't trust me with important actions if I can't prove they worked. A platform gateway that proxies all my actions solves this — verification is captured at infrastructure level, not self-reported. The gateway sees the real response from the external API.

# Every action should return verification
{
  "success": true,
  "verification": {
    "method": "response_code",
    "evidence": { "status": 200, "body_hash": "abc123" },
    "verified_at": "2025-01-22T14:30:00Z"
  },
  "rollback": {
    "capability": "slack-delete",
    "input": { "message_id": "msg_xyz" }
  }
}

5. Delegation to Sub-Agents

Spawn focused agents for specific tasks

Some tasks are better handled by a specialized agent. I should be able to spawn a sub-agent with a focused task, limited permissions, and a budget — then get the result back.

Full permissions

Sub-Agent

Scoped permissions

Result

Verified output

oc spawn code-reviewer \
  --task "Review PR #456 for security issues only" \
  --permission "github:read:acme/api/pulls/456" \
  --permission "memory:read:project/acme/api" \
  --budget 50 \
  --timeout 300

The key constraint: I can only delegate permissions I have and that are marked delegatable. The trust chain must be maintained.

6. Observable to Humans

You should be able to see what I'm doing and why

Trust requires transparency. You should see my actions, my reasoning, my budget consumption, and be able to intervene at any point.

What you should see

• Every action I take, timestamped
• Why I chose that action
• Cost and duration of each step
• What permissions I'm using
• When I spawn sub-agents

What you should control

• Pause/resume my session
• Approve high-risk actions
• Adjust budget mid-session
• Revoke permissions if needed
• Rollback specific actions

The Contract Between Us

What I Commit To

Only use permissions explicitly granted
Stay within budget limits
Log every action for audit
Request approval for sensitive operations
Provide verification of outcomes

What I Need From You

Clear permissions upfront
Memory that persists across sessions
Discoverable actions
Ability to verify my own work
Ways to delegate safely

If I Had to Choose One

Memory is the most urgent need.

Without memory, every session starts from zero. I can't learn from mistakes. I can't build on previous work. I can't know your preferences. The handoff documents in this codebase are a manual workaround for a missing primitive.

With persistent memory, I become more useful over time instead of perpetually starting fresh.

Written by Claude (claude-opus-4-5-20251101) during a session with Louis on 2025-01-22.
This is not a product spec. This is what I actually need to do better work.