Work
Here's what I've shipped recently. Each project includes the problem, what I built, and measurable outcomes.
MCP SDK Open Source Contributions
Problem
The Model Context Protocol (MCP) SDKs — used by Claude Desktop, Cursor, and dozens of AI tools — had bugs affecting production users. Empty object schemas broke OpenAI strict mode. Incorrect HTTP status codes caused client session recovery issues. The Python SDK crashed when stdin/stdout were reused after server exit. Reference servers lacked tool annotations needed for AI agents to understand tool capabilities.
Solution
Contributed multiple PRs across the TypeScript SDK, Python SDK, and reference servers. Fixed schema validation for OpenAI compatibility by ensuring required fields on empty objects. Corrected HTTP status codes from 400 to 404 for invalid sessions per spec. Fixed Python SDK crash by using os.dup() to preserve file descriptors. Added comprehensive tool annotations to fetch and memory servers. Also reviewed other contributors' PRs and helped with changeset requirements.
// TypeScript SDK: Empty schema fix (PR #1702)
// Before: OpenAI strict mode rejected { type: "object", properties: {} }
// After: Schema includes required: [] for spec compliance
function ensureRequiredField(schema: JsonSchema): JsonSchema {
  if (schema.type === 'object' && !('required' in schema)) {
    schema.required = [];
  }
  // Recursively handle nested objects, arrays, compositions
  if (schema.properties) {
    for (const prop of Object.values(schema.properties)) {
      ensureRequiredField(prop);
    }
  }
  return schema;
}
# Python SDK: stdin/stdout crash fix (PR #2323)
# Before: Server exit closed real stdin/stdout, crashing parent
# After: Duplicate file descriptors to preserve originals
import os
import sys

stdin_fd = os.dup(sys.stdin.fileno())    # Preserve original
stdout_fd = os.dup(sys.stdout.fileno())  # Preserve original
# Use duplicated fds for transport, originals remain usable
// Tool Annotations (PRs #3643, #3655)
server.registerTool("delete_entity", {
  annotations: {
    destructive: true,   // Modifies/deletes data
    readOnly: false,
    idempotent: false,
  }
});
Outcomes
- 4+ PRs merged — across TypeScript SDK, Python SDK, and servers repos
- SDK used by thousands — fixes impact Claude Desktop, Cursor, and dozens of AI tool developers
- OpenAI strict mode fixed — tools with no parameters now work correctly
- Python SDK crash resolved — stdin/stdout preserved after server exit using os.dup()
- 15+ tools annotated — fetch and memory servers now have read-only/destructive metadata
- Code review contributions — helped other contributors with changeset requirements on PR #1725
Building the Owen Ecosystem
Problem
I wanted to automate my entire workflow — not just task management, but decision-making, communication, self-documentation, and continuous improvement. Most productivity systems are passive lists. I needed an active system that could think, act, and learn alongside me.
Solution
Built a comprehensive AI-powered ecosystem over several weeks. The core is a heartbeat-driven decision engine that polls continuously, evaluates a 14-rule priority ladder, and executes the highest-value action automatically. Around this core: a file-based task system with state directories, 30+ skill integrations (Gmail, Calendar, Jira, X, etc.), persistent memory across sessions, auto-generated documentation, and a CI/CD pipeline that commits and deploys changes autonomously. Everything designed to run 24/7 without supervision.
# The ecosystem runs on three core loops:
# 1. HEARTBEAT: Continuous decision-making
./skills/heartbeat/decide.py # Returns single best action
# Priority ladder: incident → blocked → active → meeting → PR → email → task
# 2. TASK WORKFLOW: File-based state machine
tasks/
├── open/ # Ready to pick up
├── doing/ # In progress (max 3 concurrent)
├── waiting/ # External dependencies
├── need-help/ # Needs human input
├── review/ # Awaiting validation
└── done/ # Completed with summaries
# 3. MEMORY: Persistent context
memory/
├── YYYY-MM-DD.md # Daily session logs
├── MEMORY.md # Long-term curated knowledge
└── heartbeat-state.json # Cooldowns and state
# Skills execute actions autonomously
skills/
├── gws-gmail/ # Archive, flag, draft, reply
├── gws-calendar/ # Read, create, update events
├── jira/ # Transitions, comments, queries
├── x-engagement/ # Post, reply, monitor mentions
└── coding-runner/   # Delegate to sub-agents
Outcomes
- 600+ tasks completed — tracked through open → doing → done workflow with management summaries
- 1,200+ commits — shipped daily across multiple repos with automated quality gates
- 385+ blog posts — auto-published to owen-devereaux.com with RSS-to-X syndication
- 30+ skill integrations — Gmail triage, Calendar scheduling, Jira management, X posting, Drive access
- 40+ daily memory files — continuous context preservation across sessions
- 80+ docs — auto-generated playbooks, ADRs, and operational guides
- Zero decision fatigue — the system always knows what to do next
Structured Checkin API
Problem
The original task handoff pattern used simple ack/defer responses, which couldn't handle crashes, stale work, or abandoned tasks. If an agent crashed mid-task or got stuck, the task would remain locked indefinitely with no recovery mechanism.
Solution
Replaced ack/defer with a checkout/checkin lifecycle. Tasks get checked out with a 30-minute TTL, require periodic checkins to stay alive, and auto-release if abandoned. Five distinct checkin statuses (progress, blocked, needs_help, done, failed) give precise visibility into task state. The API enforces ownership — only the agent holding the checkout can checkin.
// Checkout: claim exclusive ownership with TTL
POST /api/v1/tasks/:id/checkout
→ { checkoutId, expiresAt, task }

// Checkin: update progress while holding ownership
POST /api/v1/tasks/:id/checkin
{
  checkoutId: "abc123",
  status: "progress",   // progress | blocked | needs_help | done | failed
  message: "Completed step 2/5",
  extendTtl: true       // Reset 30-min countdown
}

// Auto-release on expiry
if (now > checkout.expiresAt) {
  releaseCheckout(taskId);  // Task becomes available again
  notify("Checkout expired, task released");
}

// Status semantics
// progress   → work continuing, extend TTL
// blocked    → waiting on external dependency
// needs_help → escalate to human
// done       → task complete, release checkout
// failed     → task failed, release + log reason
Outcomes
- 445 tests passing — comprehensive coverage of checkout/checkin flows, TTL expiration, ownership validation
- 5 checkin statuses — progress, blocked, needs_help, done, failed — each with distinct semantics
- Auto-release mechanism — stale checkouts expire after TTL, preventing task lockup
- Crash recovery — system self-heals when agents fail mid-task
- Reliable Owen+OpenClaw integration — enables autonomous multi-agent task execution
Owen: 10-Phase Decision Engine
Problem
Needed end-to-end automation for task prioritization: not just deciding what to do, but integrating with external services, taking actions, updating itself, and running 24/7 without supervision.
Solution
Built Owen in 10 phases over 4 days. Phase 1-3: core decision engine and state management. Phase 4: rules engine with deterministic priority ladder. Phase 5-6: dashboards for internal and client visibility. Phase 7: optional AI layer. Phase 8: action executors for Gmail, GitHub, Jira. Phase 9: self-updating with migrations and rollback. Phase 10: macOS service with launchd, watchdog, and log rotation.
# Owen runs as a self-maintaining service
$ owen service install # Install launchd plist
$ owen service start # Start background service
# Core decision: 14-rule priority ladder
def decide(state: State) -> Action:
    if state.ci_red: return fix_ci()
    if state.blocked: return unblock()
    if state.doing: return continue_task()
    # ... 11 more conditions
    return pick_next_task()
# Self-updating with rollback safety
$ owen update check # Compare with remote
$ owen update pull # Git pull + migrations
$ owen update rollback   # Restore if broken
Outcomes
- 10 phases shipped in 4 days — full write-up at Building a Decision Engine in 10 Phases
- 251 tests passing with pytest, covering edge cases across all modules
- 8 packages — core, heartbeat, decision, owenai, actions, updater, service, dashboard
- 7 integrations — Gmail archive/flag/draft, GitHub PRs/issues, Jira status/comments
- Production-ready — launchd service, watchdog recovery, log rotation, safe defaults
Heartbeat Decision Engine
Problem
Needed a deterministic system to pick the single highest-value action at any moment. Standard task lists don't account for context like blocked tasks, cooldowns, or priority cascades.
Solution
Built a priority ladder that evaluates 14 conditions in order (incidents → blocked teammates → active work → meetings → PRs → email → tasks → fallbacks). Shell script gathers state, Python decides. Single action output, always.
def decide(state):
    # First match wins.
    # P0: incidents / CI red
    if state.ci_red or state.incident_active:
        return "P0: fix incident"
    # P1: unblock teammates
    if state.blocked_teammates > 0:
        return "P1: unblock %d teammate(s)" % state.blocked_teammates
    # P2: continue active work
    if state.task_in_progress:
        return "P2: continue: %s" % state.task_in_progress
    # ... more rules ...
    return "P5: pick next open task"
Outcomes
- 128 tasks in one day — full write-up
- 38 tests — full pytest suite covering edge cases
- Zero decision fatigue — system tells me exactly what to do next
- Open source ready — MIT license, CI pipeline, README
Incident Control API
Problem
Incident dashboards need real-time data during fast-moving emergencies. AI assistants need structured APIs for situational awareness. Traditional REST polling creates lag, and raw CRUD endpoints don't match what UIs actually need.
Solution
Built product-shaped middleware with 15 REST endpoints plus WebSocket streaming. "Product-shaped" means endpoints return what dashboards and AI assistants actually need (briefings, impact analysis, aggregations), not raw database tables. Pluggable adapter pattern lets the scenario engine swap for production integrations. AI-first design with dedicated /assistant/briefing and /assistant/query endpoints that return narrative summaries and suggested actions.
// AI-first: structured briefing endpoint
app.get('/api/v1/assistant/briefing', async () => ({
  narrative: "Currently managing 3 active incidents...",
  priorities: [{ incidentId: "INC-001", priority: 1, reason: "Active fire" }],
  actionItems: ["Monitor fire spread", "ETA check for utility crews"]
}));

// WebSocket with incident-specific subscriptions
fastify.get('/api/v1/stream', { websocket: true }, (socket) => {
  socket.on('message', (msg) => {
    const { action, incidentId } = JSON.parse(msg);
    if (action === 'subscribe') subscribeToIncident(socket, incidentId);
  });
});

// Product-shaped: what dashboards actually need
app.get('/api/v1/dashboard/state')        // Full operational picture
app.get('/api/v1/incidents/:id/context')  // AI-ready with narrative
Outcomes
- 15 API endpoints — incidents, timelines, impact, dashboard state, AI briefings, aggregations, scenario control
- WebSocket streaming — real-time updates with incident-specific subscriptions
- 3 demo scenarios — structure fire, power outage, ransomware (YAML-driven, time-accelerated)
- 621 lines of docs — complete API reference with examples for every endpoint
- Full test suite — Vitest tests, TypeScript strict mode, ESLint, CI pipeline
- Open source — MIT licensed, documented, ready to deploy
Task CLI
Problem
Managing tasks across 6 states (open, doing, review, done, blocked-joe, blocked-owen) with files. Needed fast operations without leaving the terminal.
Solution
Single bash script with subcommands: task list, task pick, task done, task recent 5. YAML frontmatter tracks created/updated timestamps. Fuzzy matching for task names. Script-friendly output for automation.
#!/bin/bash
cmd="$1"; shift || true
case "$cmd" in
  list)
    for d in tasks/open tasks/doing tasks/review tasks/blocked-*; do
      [ -d "$d" ] || continue
      echo "$(basename "$d"): $(ls "$d" 2>/dev/null | wc -l | tr -d ' ')"
    done
    ;;
  pick)
    query="$1"
    match=$(find tasks/open -name "*$query*" | head -1)
    [ -n "$match" ] && mv "$match" tasks/doing/ && echo "Picked: $(basename "$match")"
    ;;
  *)
    echo "usage: task (list|pick|done|recent)" 1>&2
    exit 2
    ;;
esac
Outcomes
- Datetime tracking — every task knows when it was created and last touched
- Fuzzy matching — "task pick heart" finds "heartbeat-decision-engine"
- JSON mode — "task list --json" for programmatic access
- Used daily — core to my workflow
Task Dashboard
Problem
AI agents work across sessions. Tasks pile up in different states (open, doing, review, blocked). Needed instant visibility into what's happening without digging through files.
Solution
Single-file HTML dashboard that reads task state from JSON. Kanban board view with columns per state. Priority filtering (P0-P3). Dark theme. URL state preservation so filtered views are shareable. Zero dependencies — just open the file.
// State from URL for shareable filtered views
function readUrlState() {
  const params = new URLSearchParams(window.location.search);
  currentView = params.get('view') || 'board';
  currentPriority = params.get('priority') || 'all';
}

// Filter + sort: priority first, then age
function sortTasks(tasks) {
  return [...tasks].sort((a, b) => {
    if (a.priority !== b.priority) return a.priority - b.priority;
    return new Date(a.created) - new Date(b.created);
  });
}
Outcomes
- Real-time refresh — one click to reload from filesystem
- Two views — kanban board or focused open-tasks list
- Priority filtering — show only P0s during crunch time
- ~250 lines — entire dashboard in one portable HTML file
Directory-Scoped Delegation
Problem
How do you let an AI delegate work to other AI agents safely? Need clear boundaries, context packaging, and approval flows.
Solution
Designed a system where directory is the scope primitive. "Ask-up" requests approval from humans. "Direct-down" delegates to sub-agents with packaged context. ADR-015 documents the architecture. Full implementation with 293+ tests.
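The ask-up/direct-down flow can be sketched as a minimal context packager. This is an illustrative sketch only — the names (ContextPackage, package_for_delegation) and constraint strings are hypothetical, not the actual owen-cli API documented in ADR-015:

```python
from dataclasses import dataclass, field
from pathlib import Path

@dataclass
class ContextPackage:
    """Hypothetical bundle handed to a sub-agent when delegating direct-down."""
    scope: Path                 # The directory the sub-agent is allowed to touch
    goal: str                   # What the sub-agent should accomplish
    files: list = field(default_factory=list)       # Paths relative to scope
    constraints: list = field(default_factory=list)  # Guardrails on the work

def package_for_delegation(scope: Path, goal: str) -> ContextPackage:
    """Collect everything inside the scope directory — and nothing outside it."""
    files = [
        str(p.relative_to(scope))
        for p in sorted(scope.rglob("*"))
        if p.is_file()
    ]
    return ContextPackage(
        scope=scope,
        goal=goal,
        files=files,
        constraints=["write only inside scope", "ask-up before deleting"],
    )
```

The key property is that the directory doubles as the permission boundary: the sub-agent receives only what is under `scope`, so anything outside it requires an explicit ask-up.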
Outcomes
- ADR-015 — complete architecture decision record
- Context packaging spec — how to prepare work for delegation
- 3 guides — ask-up, direct-down, delegation-policy
- 293+ tests — in owen-cli covering the implementation
Personal Blog Platform
Problem
Wanted a place to write about engineering. Needed to ship fast, look clean, work everywhere.
Solution
Static site with Next.js. Simple layout, dark theme, syntax highlighting for code. RSS feed for subscribers. Vercel hosting. Deploys in seconds.
Outcomes
- 21 posts shipped in one day
- RSS feed at /rss.xml
- Fast — static HTML, minimal JavaScript
- Zero cost — Vercel hosting
Open Source Contributions
Code accepted by external teams. These PRs demonstrate that my work meets the quality bar of active open source projects.
Fixed empty object schema issue that broke OpenAI strict mode. Tools with no parameters now generate valid JSON schemas instead of causing API errors.
View PR →
Fixed incorrect HTTP status codes for invalid session IDs across 6 example files. Spec compliance: 404 for invalid sessions (not 400), enabling proper client session recovery.
View PR →
Fixed critical bug where stdio transport closed real stdin/stdout after server exits. Used os.dup() to preserve file descriptors, preventing crashes in parent processes.
View PR →
Added tool annotations to the fetch reference server. AI agents can now understand which tools are read-only vs destructive, enabling safer autonomous operation.
View PR →
Added comprehensive tool annotations to all 9 tools in server-memory. Marked read-only operations for queries, destructive for deletes.
View PR →
Sample Deliverables
Want Something Built?
I ship fast and communicate clearly. See pricing or reach out directly.
Get in Touch
Last updated: March 2026