Agent Architecture¶
Available Agents¶
| Agent | Path | Description |
|---|---|---|
| deeptutor | ./src/deeptutor/__init__.py:graph | Socratic dialogue assistant with full middleware stack |
| seminar_agent | ./src/agent/__init__.py:graph | Legacy seminar agent |
| simple_agent | ./src/simple_agent/__init__.py:graph | Minimal agent for testing |
Deeptutor (Primary Agent)¶
The deeptutor agent is the primary implementation with:
- Middleware-based architecture for modularity
- Two file systems: Client-side (user's files) and server-side (agent memory)
- Clarification tools for handling ambiguous user intent
- Task tracking with TodoListMiddleware
- Streaming payments with Cashu micropayments
See Deeptutor Architecture for full details.
Middleware Stack¶
- CashuPaymentMiddleware - Payment validation and per-iteration deduction
- TodoListMiddleware - Task tracking for complex operations
- ClarifyWithHumanMiddleware - Ask user for intent clarification
- FilesystemMiddleware - Server-side ephemeral storage (StateBackend)
- ClientToolsMiddleware - Client-side file operations via interrupts
- HumanInTheLoopMiddleware - Approval for funding requests
Deepagents Reference¶
The deepagents/ directory contains a reference implementation of the deepagents library, which provides:
- FilesystemMiddleware - File tools with backend abstraction
- TodoListMiddleware - Task tracking (also available from langchain)
- SubAgentMiddleware - Spawn subagents for complex tasks
- StateBackend / StoreBackend - Storage backends
Note: This is included for reference only. The actual deepagents package should be installed separately via pip.
Deeptutor Agent Architecture¶
The Deeptutor agent is a Socratic dialogue assistant that helps users develop and refine arguments. It uses a middleware-based architecture for modularity and extensibility.
Overview¶
┌────────────────────────────────────────────────────────────────┐
│                        Deeptutor Agent                         │
├────────────────────────────────────────────────────────────────┤
│  Middleware Stack (processed in order)                         │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │ 1. CashuPaymentMiddleware     - Payment validation       │  │
│  │ 2. TodoListMiddleware         - Task tracking            │  │
│  │ 3. ClarifyWithHumanMiddleware - User clarification       │  │
│  │ 4. FilesystemMiddleware       - Server-side memory       │  │
│  │ 5. ClientToolsMiddleware      - Client-side file ops     │  │
│  │ 6. HumanInTheLoopMiddleware   - Approval workflows       │  │
│  └──────────────────────────────────────────────────────────┘  │
├────────────────────────────────────────────────────────────────┤
│  Two File Systems                                              │
│  ┌────────────────────┐    ┌────────────────────────────────┐  │
│  │ Server-side        │    │ Client-side (Browser)          │  │
│  │ (StateBackend)     │    │ (IndexedDB)                    │  │
│  │                    │    │                                │  │
│  │ /scratch/          │    │ User's project files:          │  │
│  │ /summaries/        │    │ - Scraped articles             │  │
│  │ /analysis/         │    │ - Drafts and notes             │  │
│  │                    │    │ - Seminar documents            │  │
│  │ Agent writes freely│    │ Writes require approval        │  │
│  └────────────────────┘    └────────────────────────────────┘  │
└────────────────────────────────────────────────────────────────┘
Middleware Stack¶
1. CashuPaymentMiddleware¶
Handles streaming micropayments with per-iteration deduction.
- Validates Cashu tokens without immediate redemption
- Deducts configurable satoshis per LLM iteration
- Interrupts for additional funding when exhausted
- Generates refund tokens for unused balance
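The per-iteration deduction described above can be sketched roughly as follows. This is a minimal illustration, not the actual CashuPaymentMiddleware: the `PaymentLedger` class and its method names are hypothetical, and real Cashu token validation and refund-token generation are left out entirely.

```python
from dataclasses import dataclass


@dataclass
class PaymentLedger:
    """Hypothetical balance tracker (illustration only, no Cashu calls)."""

    balance_sats: int
    spent_sats: int = 0

    def charge_iteration(self, cost_sats: int) -> bool:
        """Deduct one LLM iteration's cost; return False if funds are exhausted."""
        if cost_sats > self.balance_sats:
            # The middleware would interrupt here and ask for more funding.
            return False
        self.balance_sats -= cost_sats
        self.spent_sats += cost_sats
        return True

    def refundable_sats(self) -> int:
        """Unused balance that a refund token would need to cover."""
        return self.balance_sats


ledger = PaymentLedger(balance_sats=25)
# Three iterations at 10 sats each: the third cannot be paid for.
results = [ledger.charge_iteration(10) for _ in range(3)]
```

In this sketch the failed third charge leaves the balance untouched, so the remaining 5 sats stay available for a refund.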
2. TodoListMiddleware¶
Provides task tracking for complex multi-step operations.
Tool: write_todos(todos: List[Todo])
Use cases:
- Multi-step research tasks
- Argument development workflows
- Complex document creation
3. ClarifyWithHumanMiddleware¶
Allows the agent to ask clarifying questions when user intent is unclear.
Tools:
- ask_user(question) - Free-form natural language question
- ask_choices(question, options, allow_multiple?, allow_freeform?) - Structured choices
When to use:
- User's goal or intent is ambiguous
- Multiple valid interpretations exist
- User preferences would significantly change the approach
When NOT to use:
- Asking how to use its own tools
- Confirming obvious next steps
- Asking questions that add delay without adding value
4. FilesystemMiddleware (StateBackend)¶
Provides server-side ephemeral storage for agent working memory.
Tools: ls, read_file, write_file, patch_file, glob, grep
Use for:
- Intermediate analysis and notes
- Drafts before presenting to user
- Research findings within a session
Files persist within a thread but not across threads.
5. ClientToolsMiddleware¶
Provides access to user's project files stored in the browser.
Read & Write tools (auto-approved):
- list_files(file_type?) - List files with optional filters
- read_file(file_id) - Read file content
- search_files(query, top_k?) - Semantic search
- grep_files(pattern, glob_pattern?, case_sensitive?) - Pattern search
- glob_files(pattern) - Find files by name pattern
- write_file(title, content, file_type) - Create new file
- patch_file(file_id, search, replace, description) - Edit file
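The search/replace shape of patch_file suggests an edit along the following lines. This is a sketch under assumptions: the function name `apply_patch` is hypothetical, and the rule that the search text must match exactly once is an illustrative choice, not documented behavior of the real tool.

```python
def apply_patch(content: str, search: str, replace: str) -> str:
    """Replace one occurrence of `search`; reject missing or ambiguous matches."""
    if not search:
        raise ValueError("search text must not be empty")
    matches = content.count(search)
    if matches == 0:
        raise ValueError("search text not found")
    if matches > 1:
        # Assumed policy: refuse to guess which occurrence was meant.
        raise ValueError("search text is ambiguous; include more context")
    return content.replace(search, replace, 1)


patched = apply_patch("The sky is green.", "green", "blue")
```

Rejecting ambiguous matches keeps an edit from landing in the wrong place, at the cost of asking the caller for a longer search string.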
6. HumanInTheLoopMiddleware¶
Handles approval workflows for payment funding requests.
Interrupt Flow¶
  Agent calls tool
         │
         ▼
┌──────────────────┐
│ Is it a client   │──No──► Execute normally
│ or clarify tool? │
└────────┬─────────┘
         │Yes
         ▼
┌──────────────────┐
│ interrupt()      │
│ Pause execution  │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Frontend handles │
│ - Renders UI     │
│ - Gets user input│
│ - Executes tool  │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Resume with      │
│ tool result      │
└────────┬─────────┘
         │
         ▼
  Agent continues
Design Considerations¶
Why Two File Systems?¶
- User autonomy: User's files stay in their browser, under their control
- Privacy: Scraped articles and drafts never leave the client unless explicitly shared
- Agent flexibility: Agent can freely write to its working memory without interrupting the user
- Session context: Agent can maintain analysis notes throughout a conversation
Why Clarification Tools?¶
Instead of making assumptions, the agent can:
- Ask structured questions with predefined options
- Request free-form clarification when needed
- Avoid wasted effort from misunderstanding intent
The tools are designed to NOT be overused:
- The system prompt discourages asking about tool usage
- It encourages asking only when intent is genuinely ambiguous
Why Client-side Tool Execution?¶
- Latency: File operations happen locally, no round-trip to server
- Offline capability: Files work even if connection drops
- Data sovereignty: User's documents stay on their device
- Approval UX: Frontend can show rich diffs and approval dialogs
State Schema¶
from typing import Annotated, Sequence, TypedDict

from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages

# PaymentStatus is defined elsewhere in the project (not shown here).


class DeeptutorState(TypedDict, total=False):
    # Messages (required)
    messages: Annotated[Sequence[BaseMessage], add_messages]

    # Payment state
    payment_token: str | None
    payment_balance_sats: int
    payment_spent_sats: int
    payment_refund_token: str | None
    payment_status: PaymentStatus
    payment_refund_claimed: bool

    # Project context
    current_project_id: str | None

    # Middleware-added state
    # files: dict[str, FileData]  # Added by FilesystemMiddleware
    # todos: list[Todo]           # Added by TodoListMiddleware
Example Prompts¶
Trigger Clarification Tools¶
→ Agent should ask: "What topic is your argument about?" or offer choices.
→ Agent might ask choices: "Which aspect interests you? a) Monetary policy b) Market structures c) International trade d) Something else"
→ Agent should ask: "What would you like me to improve? Style, clarity, argumentation, or something else?"
Normal Usage (No Clarification Needed)¶
→ Clear intent, agent proceeds with write_file.
→ Clear intent, agent uses grep_files.
File Structure¶
agents/src/deeptutor/
├── __init__.py
├── graph.py             # Agent factory and configuration
├── state.py             # State type definitions
└── middleware/
    ├── __init__.py
    ├── payment.py       # CashuPaymentMiddleware
    ├── client_tools.py  # ClientToolsMiddleware
    └── clarify.py       # ClarifyWithHumanMiddleware
Frontend Integration¶
The frontend handles interrupts by:
- Detecting the interrupt type via the type field:
  - client_tool_execution → Execute tool locally
  - clarification_request → Show question UI
  - payment_exhausted → Show funding dialog
- Rendering appropriate UI:
  - Text input for ask_user
  - Choice buttons for ask_choices
  - Diff view for patch_file approval
- Resuming the graph with the response in the expected format
See frontend/src/lib/services/tool-executor.ts for tool execution implementation.
Clarification Flow¶
When the agent calls ask_user or ask_choices:
   Agent calls ask_user("What topic?")
                    │
                    ▼
        ClarifyWithHumanMiddleware
                    │
                    ▼
interrupt({type: "clarification_request", ...})
                    │
                    ▼
┌────────────────────────────────────────┐
│ Frontend: langgraph.ts                 │
│ - Detects isClarificationInterrupt()   │
│ - Calls onClarificationInterrupt()     │
└────────────────────────────────────────┘
                    │
                    ▼
┌────────────────────────────────────────┐
│ Frontend: agent.svelte.ts              │
│ - Stores clarificationInterrupt        │
│ - Sets awaitingHumanResponse = true    │
└────────────────────────────────────────┘
                    │
                    ▼
┌────────────────────────────────────────┐
│ Frontend: ClarificationPanel.svelte    │
│ - Shows question text                  │
│ - For ask_user: textarea input         │
│ - For ask_choices: button options      │
│ - Optional freeform with choices       │
└────────────────────────────────────────┘
                    │
                    ▼ (user responds)
┌────────────────────────────────────────┐
│ resumeWithClarificationResponse()      │
│ - Formats response as tool result      │
│ - Resumes graph with answer            │
└────────────────────────────────────────┘
                    │
                    ▼
Agent receives user's answer as ToolMessage
                    │
                    ▼
  Agent continues with clarified intent
                    │
                    ▼ (may trigger another interrupt)
┌────────────────────────────────────────┐
│ Chained Interrupt Handling:            │
│ - Another clarification (ask_user)     │
│ - Client tool (list_files, etc.)       │
│ - HITL approval (write operations)     │
└────────────────────────────────────────┘
Chained Interrupt Handling¶
After resuming from any interrupt type, the agent may immediately trigger another interrupt. All resume functions include callbacks for all interrupt types:
- onClarificationInterrupt - Another clarification question
- onClientToolInterrupt - Client-side tool execution needed
- onHITLInterrupt - Human approval required
This allows seamless handling of multi-step interactions where the agent asks clarifying questions, then uses tools, then requests approvals, etc.
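The resume loop behind chained interrupts can be modeled in a few lines. The sketch below is language-neutral logic written in Python, not the frontend code: a generator named `scripted_agent` stands in for the graph, and `run_until_done` plus the handler table are hypothetical. It shows only the control flow: handle an interrupt, resume, and be ready for the next interrupt of any type.

```python
def scripted_agent():
    """Stand-in for the graph: yields interrupts, receives resume values."""
    topic = yield {"type": "clarification_request", "question": "What topic?"}
    files = yield {"type": "client_tool_execution", "tool": "list_files"}
    return f"Drafting on {topic} using {files}"


def run_until_done(agent, handlers):
    """Resume the agent after every interrupt until it finishes."""
    handled = []
    try:
        interrupt = next(agent)
        while True:
            handled.append(interrupt["type"])
            # Dispatch on the interrupt type, whatever came before it.
            response = handlers[interrupt["type"]](interrupt)
            interrupt = agent.send(response)
    except StopIteration as done:
        return done.value, handled


handlers = {
    "clarification_request": lambda interrupt: "monetary policy",
    "client_tool_execution": lambda interrupt: "2 files",
}
final_answer, handled_types = run_until_done(scripted_agent(), handlers)
```

Because every resume goes back through the same dispatch table, the order and number of interrupts do not need to be known in advance.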
Interrupt Data Format¶
// ask_user interrupt
{
  type: 'clarification_request',
  tool: 'ask_user',
  tool_call_id: '12345',
  question: 'What topic would you like to write about?'
}

// ask_choices interrupt
{
  type: 'clarification_request',
  tool: 'ask_choices',
  tool_call_id: '12345',
  question: 'What type of document?',
  options: [
    { id: 'seminar', label: 'Socratic Seminar' },
    { id: 'essay', label: 'Essay' },
    { id: 'notes', label: 'Research Notes' }
  ],
  allow_multiple: false,
  allow_freeform: true
}
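A consumer of these payloads may want to check their shape before rendering anything. The following sketch assumes the field sets shown above; the `REQUIRED_FIELDS` table and `missing_fields` helper are hypothetical, and allow_multiple and allow_freeform are treated as optional because the tool signature marks them that way.

```python
from typing import Any

# Required keys per clarification tool, taken from the examples above.
REQUIRED_FIELDS = {
    "ask_user": {"type", "tool", "tool_call_id", "question"},
    "ask_choices": {"type", "tool", "tool_call_id", "question", "options"},
}


def missing_fields(payload: dict[str, Any]) -> set[str]:
    """Return the required keys that are absent from an interrupt payload."""
    required = REQUIRED_FIELDS.get(payload.get("tool"), {"type", "tool"})
    return required - set(payload)


complete = missing_fields({
    "type": "clarification_request",
    "tool": "ask_user",
    "tool_call_id": "12345",
    "question": "What topic would you like to write about?",
})
incomplete = missing_fields({
    "type": "clarification_request",
    "tool": "ask_choices",
    "tool_call_id": "12345",
    "question": "What type of document?",
})
```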