The Spectrum of Autonomy
AI agents exist on a spectrum from fully manual to fully autonomous. Each level implies a different interaction pattern and a different set of design requirements.
The right level of autonomy depends on task risk, user expertise, and the cost of failure. High-risk tasks need more human involvement; low-risk, repetitive tasks benefit from delegation.
| Level | Pattern | Human role | Example |
|---|---|---|---|
| 0 | Manual | Performs all actions | Standard form entry |
| 1 | Suggest | Chooses whether to accept | Autocomplete, spell check |
| 2 | Confirm-by-exception | Reviews only flagged items | Email filtering, fraud alerts |
| 3 | Execute with oversight | Monitors, can intervene | Meeting scheduling, code review bot |
| 4 | Full autonomy | Defines goal, reviews outcome | Automated report generation |
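The levels in the table can be modeled as an ordered enumeration, which makes "more" or "less" autonomy a simple comparison. The sketch below is only an illustration: `recommend_level`, its risk labels, and the ceilings it uses are assumptions, not values prescribed by this guide.

```python
from enum import IntEnum


class AutonomyLevel(IntEnum):
    """The five levels from the table, ordered from least to most autonomous."""
    MANUAL = 0
    SUGGEST = 1
    CONFIRM_BY_EXCEPTION = 2
    EXECUTE_WITH_OVERSIGHT = 3
    FULL_AUTONOMY = 4


def recommend_level(risk: str, task_is_familiar: bool) -> AutonomyLevel:
    """Hypothetical policy: cap autonomy by task risk, then lower it for unfamiliar tasks."""
    # Illustrative ceilings only; real values depend on the cost of failure.
    ceiling = {
        "high": AutonomyLevel.SUGGEST,
        "medium": AutonomyLevel.CONFIRM_BY_EXCEPTION,
        "low": AutonomyLevel.FULL_AUTONOMY,
    }[risk]
    if task_is_familiar:
        return ceiling
    # Unfamiliar tasks start one level lower, but never below MANUAL.
    return AutonomyLevel(max(ceiling - 1, AutonomyLevel.MANUAL))
```

Because `IntEnum` members compare as integers, a check such as `level >= AutonomyLevel.EXECUTE_WITH_OVERSIGHT` can gate features that only make sense at higher autonomy.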
Confirm-by-Exception
Confirm-by-exception is the most practical pattern for delegating routine tasks. The agent executes actions within a defined scope and escalates to the user only when it encounters ambiguity, low confidence, or a boundary condition.
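    def process_invoice(invoice_data):
        # confidence(), auto_approve(), escalate_to_user(), and APPROVAL_LIMIT are defined elsewhere.
        # invoice_data is assumed to be a dict with an "amount" key.
        if confidence(invoice_data) <= 0.95:
            escalate_to_user(invoice_data, reason="confidence_threshold")
        elif invoice_data["amount"] >= APPROVAL_LIMIT:
            # Report the actual cause so the user knows why the item was flagged.
            escalate_to_user(invoice_data, reason="approval_limit")
        else:
            auto_approve(invoice_data)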
Let the user set their own threshold. A power user might accept 90% confidence for speed; a compliance officer might require 99%.
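One way to make the threshold user-configurable is to move it out of the code and into a per-user policy object. The following is a minimal sketch under assumed names: `UserPolicy`, `decide`, the default approval limit, and the reason strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class UserPolicy:
    """Per-user settings; field names and defaults are illustrative."""
    min_confidence: float = 0.95
    approval_limit: float = 1000.0


def decide(confidence: float, amount: float, policy: UserPolicy) -> Tuple[str, Optional[str]]:
    """Return ("auto_approve", None) or ("escalate", reason)."""
    if confidence < policy.min_confidence:
        return ("escalate", "confidence_below_threshold")
    if amount >= policy.approval_limit:
        return ("escalate", "amount_at_or_over_limit")
    return ("auto_approve", None)


# The same invoice is handled differently depending on who delegated it.
power_user = UserPolicy(min_confidence=0.90)
compliance_officer = UserPolicy(min_confidence=0.99)

print(decide(0.93, 250.0, power_user))          # auto-approved
print(decide(0.93, 250.0, compliance_officer))  # escalated
```

Returning the reason alongside the decision lets the interface explain each escalation instead of presenting an unexplained flag.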
Simulate-Then-Act
Before executing an action with lasting consequences, the agent shows the user what it intends to do and the expected outcome. The user can approve, modify, or reject.
This pattern is effective for:
- Bulk operations — “I will archive 47 emails. Here is a sample of 3.”
- Financial actions — “I will transfer $500 to Savings. Estimated balance after: $2,430.”
- Content publishing — “Here is the draft post. It will go live at 9 AM and reach approximately 2,000 followers.”
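The bulk-operation case can be sketched by separating the plan (what would happen) from its execution (making it happen). Everything named here, including `Plan`, `plan_archive`, and `run_with_approval`, is a hypothetical illustration; it covers approve and reject only, and a real interface would also let the user modify the plan.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Plan:
    """A proposed action plus a human-readable preview of its effect."""
    summary: str
    sample: List[str]
    execute: Callable[[], int]


def plan_archive(inbox: List[str], to_archive: List[str], sample_size: int = 3) -> Plan:
    """Build the plan without touching the inbox (the 'simulate' step)."""
    def execute() -> int:
        # The 'act' step: runs only after approval.
        for message in to_archive:
            inbox.remove(message)
        return len(to_archive)

    shown = min(sample_size, len(to_archive))
    return Plan(
        summary=f"I will archive {len(to_archive)} emails. Here is a sample of {shown}.",
        sample=to_archive[:sample_size],
        execute=execute,
    )


def run_with_approval(plan: Plan, approve: Callable[[Plan], bool]) -> int:
    """Show the plan to the approver; execute only if it is accepted."""
    return plan.execute() if approve(plan) else 0
```

The `approve` callback stands in for whatever confirmation dialog the product uses. Nothing in the inbox changes until it returns `True`.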
Progressive Autonomy
Trust is earned over time. Start the agent at a lower autonomy level and allow it to earn higher levels through demonstrated competence:
| Phase | Pattern | Duration | Criteria to advance |
|---|---|---|---|
| Observation | Shadow mode — agent suggests, user decides | 1–2 weeks | User acceptance rate > 90% |
| Limited | Confirm-by-exception with narrow scope | 2–4 weeks | Fewer than 5 escalations per week |
| Extended | Confirm-by-exception with wider scope | Ongoing | Error rate below threshold |
| Full | Execute with oversight; user sets goals and can intervene | Trust-based | Consistent performance over months |
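The advancement criteria in the table can be expressed as a small state machine. This is a sketch under stated assumptions: the metric names are invented, the minimum durations take the lower bound of each range, and the error-rate threshold (which the table leaves unspecified) is a placeholder parameter.

```python
from dataclasses import dataclass

PHASES = ["observation", "limited", "extended", "full"]


@dataclass
class Metrics:
    """Metrics collected during the current phase; names are illustrative."""
    weeks_in_phase: float
    acceptance_rate: float       # fraction of suggestions the user accepted
    escalations_per_week: float
    error_rate: float


def next_phase(phase: str, m: Metrics, max_error_rate: float = 0.01) -> str:
    """Return the phase the agent should be in after reviewing its metrics."""
    if phase == "observation":
        ready = m.weeks_in_phase >= 1 and m.acceptance_rate > 0.90
    elif phase == "limited":
        ready = m.weeks_in_phase >= 2 and m.escalations_per_week < 5
    elif phase == "extended":
        # max_error_rate is an assumed placeholder, not a recommended value.
        ready = m.error_rate < max_error_rate
    else:
        ready = False  # "full" is the last phase
    return PHASES[PHASES.index(phase) + 1] if ready else phase
```

The function only ever advances one phase at a time. Demoting the agent after a run of errors would be a natural extension, but it is not covered by the table and is left out here.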
Delegation with Constraints
Users should be able to delegate tasks with explicit guardrails:
- Scope constraints — “Only touch emails from external senders.”
- Time constraints — “Only act during business hours.”
- Value constraints — “Do not approve expenses over $200.”
- Approval chains — “If it involves legal review, escalate to compliance.”
Implicit constraints are invisible to users, which makes the agent's behavior hard to predict. If the agent has a hard-coded rule (“never delete events with more than 10 attendees”), make that rule visible and editable.
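The four guardrail types above can live in a single settings object that the user can read and edit, rather than in conditionals scattered through the agent. The sketch below is illustrative only: `Constraints`, `Action`, `violations`, and the default values are assumptions chosen to mirror the examples in the list.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Tuple


@dataclass
class Constraints:
    """User-editable guardrails; every rule is a visible field, none is hard-coded."""
    external_senders_only: bool = True
    business_hours: Tuple[int, int] = (9, 17)   # start inclusive, end exclusive
    max_expense: float = 200.0
    escalate_topics: Tuple[str, ...] = ("legal",)


@dataclass
class Action:
    """A proposed agent action, reduced to the fields the guardrails inspect."""
    sender_is_external: bool
    amount: float
    when: datetime
    topics: Tuple[str, ...] = ()


def violations(action: Action, c: Constraints) -> List[str]:
    """Return a readable list of every guardrail the action would break."""
    found = []
    if c.external_senders_only and not action.sender_is_external:
        found.append("scope: sender is internal")
    start, end = c.business_hours
    if not (start <= action.when.hour < end):
        found.append("time: outside business hours")
    if action.amount > c.max_expense:
        found.append("value: amount exceeds limit")
    if any(topic in c.escalate_topics for topic in action.topics):
        found.append("approval: escalate to compliance")
    return found
```

An empty result means the agent may proceed. A non-empty result doubles as the explanation shown to the user, and editing a field such as `max_expense` changes the agent's behavior without a code change.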
Multi-Turn and Stateful Interaction
Unlike search or Q&A, agent interactions are stateful. The agent remembers context across turns. Design implications:
- State visibility — show the user what the agent remembers about them.
- Context resets — provide a clear way to start fresh.
- Conversation branching — allow the user to explore alternatives without losing the primary thread.
- Interruptibility — the user should be able to interrupt the agent mid-task with a correction.
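The four design implications above map onto a small session object. This is a minimal, single-threaded sketch with hypothetical names (`Session` and its methods); a production agent would need persistence and concurrency handling that are omitted here.

```python
import copy
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Session:
    """Hypothetical conversation state for one agent session."""
    memory: Dict[str, str] = field(default_factory=dict)
    turns: List[str] = field(default_factory=list)
    interrupted: bool = False

    def visible_state(self) -> Dict[str, str]:
        """State visibility: a copy of what the agent remembers, safe to display."""
        return dict(self.memory)

    def reset(self) -> None:
        """Context reset: forget everything and start fresh."""
        self.memory.clear()
        self.turns.clear()
        self.interrupted = False

    def branch(self) -> "Session":
        """Conversation branching: an independent copy; the original is untouched."""
        return copy.deepcopy(self)

    def interrupt(self, correction: str) -> None:
        """Interruptibility: record the correction and ask the running task to stop."""
        self.turns.append(correction)
        self.interrupted = True

    def run(self, steps: List[Callable[[], None]]) -> int:
        """Run steps in order, checking for an interrupt before each one."""
        completed = 0
        for step in steps:
            if self.interrupted:
                break
            step()
            completed += 1
        return completed
```

Checking the interrupt flag between steps means a correction takes effect at the next step boundary, so steps should be kept small enough that the user is not left waiting.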
Key Takeaways
- Match autonomy level to task risk and user preference — default to lower autonomy for unfamiliar or high-stakes tasks.
- Confirm-by-exception is the most practical pattern for delegating routine work.
- Progressive autonomy builds trust gradually and gives users a sense of control.
- Make all constraints explicit, visible, and editable.
- Design for stateful, interruptible, multi-turn interaction — agents are not single-query tools.