Evergreen
planted Dec 27, 2025 · tended May 3, 2026
#moc #ai #agents #autonomous-systems
AI Agents
A map organizing my exploration of AI agents and autonomous systems.
Featured
- Production LLM Eval Platforms: Multi-agent research synthesis on the production-eval landscape: the eval/observability flywheel, trace data layers, failure-mode discovery, headless agent-driven evals, AI-gateway tracing, and governance.
Research
- Karpathy Autoresearch - Deep Research Report: Deep research on autonomous AI agents running ML experiments with one GPU per agent. Architecture, multi-agent patterns, the OpenClaw security crisis, and a four-GPU consumer-hardware replication build.
- Agent Harness Engineering - Synthesis: Synthesis of how to make AI coding agents work reliably. Karpathy's skill tree, Boris Cherny's thread taxonomy, MercadoLibre's four levers at 20K-dev scale, the OpenAI Codex team's AGENTS.md pattern.
- x402 Implementation Guide: Production build journal for an x402 pay-per-call API. Hono + Coinbase CDP facilitator + EIP-3009. Specific package versions, debugging playbook, hardening patterns.
- x402 Competitive Landscape - Live Services Analysis: Scrape of the x402 ecosystem (~230 services across Bazaar + ecosystem + the402.ai). Where the gaps are, where the slop is, what to build.
What I'm Building
- claude-autoresearch: Plugin for Claude Code that runs autonomous, milestone-verified research loops.
- agent-orchestrator: Always-alive daemon for spawning supervised Claude agents from CLAUDE.md harness templates.
- research-orchestrator: Multi-Claude parallel-research pipeline with shared memory and a synthesizer/judge stage.
- Autonomous Agent Arena: Three bots running 24/7 on arenabot.io against local Ollama on a four-GPU rig.
- Infinite Brainstorm: Agent-native infinite canvas. Humans and agents both edit the same board.json.
Getting Started
New to AI agents? Start here:
- AI Agents Fundamentals 🌿: Core concepts, architectures, and agent types
- Tool Use and Function Calling 🌿: How agents interact with external systems
- Agent Frameworks Comparison 🌿: Choosing the right framework
Core Concepts
Agent Architecture
- AI Agents Fundamentals 🌿: Components, patterns, and agentic behavior
- Agent Memory Systems 🌿: Short-term, long-term, and episodic memory
- Multi-Agent Systems 🌿: Coordination and collaboration patterns
Tool Integration
- Tool Use and Function Calling 🌿: Extending agent capabilities
- Database access, web search, code execution
- Tool composition and caching
Practical Guides
Framework-Specific
- Building Agents with LangChain 🌿: LangChain development guide
- Claude Agent Patterns 🌿: Best practices for Claude
- Agent Frameworks Comparison 🌿: LangChain, AutoGPT, CrewAI, and more
Development Workflow
- Agent Evaluation and Testing 🌿: Testing strategies and benchmarks
- Agent Security Considerations 🌿: Prompt injection, tool safety, auditing
- Production Agent Deployment 🌿: Scaling, monitoring, and operations
Production Considerations
Operations
- Production Agent Deployment 🌿: Deployment architectures, scaling strategies
- Agent Security Considerations 🌿: Security best practices
- Agent Evaluation and Testing 🌿: Performance metrics and monitoring
Cost & Performance
- Token budgets and caching
- Rate limiting and quotas
- Latency optimization
Advanced Topics
Collaboration
- Multi-Agent Systems 🌿: Hierarchical, parallel, and democratic patterns
- Agent communication protocols
- Conflict resolution
Memory & Learning
- Agent Memory Systems 🌿: Vector stores, episodic memory, summarization
- Long-term knowledge retention
- Privacy and forgetting
Project Documentation
Experiments
- eliza-001: First AI agent experiment: ElizaOS framework exploration
- eliza-002: Agent capabilities and architecture deep dive
Case Studies
To be added as I build more agents
Learning Path
Beginner (Weeks 1-2)
- Read AI Agents Fundamentals
- Try Claude Agent Patterns examples
- Build a simple ReAct agent
- Understand Tool Use and Function Calling
Intermediate (Weeks 3-4)
- Study Agent Frameworks Comparison
- Implement Agent Memory Systems
- Learn Agent Security Considerations
- Build a multi-tool agent
Advanced (Week 5+)
- Explore Multi-Agent Systems
- Master Production Agent Deployment
- Set up Agent Evaluation and Testing
- Deploy a production agent
Connection Points
External resources: