> AI Agents Fundamentals

Budding

planted Jan 8, 2026tended Jan 8, 2026

#ai-agents#fundamentals#architecture#autonomous-systems

AI Agents Fundamentals

🌿 Budding note — foundational concepts for autonomous AI systems.

What is an AI Agent?

An AI agent is an autonomous system that:

Perceives its environment through inputs (text, APIs, sensors)
Reasons about what actions to take
Acts on the environment to achieve goals
Learns from feedback to improve over time

Key distinction from chatbots:

Chatbot:    User → LLM → Response
Agent:      User → LLM → [Tools] → Actions → Results → LLM → Response
                    ↑__________________|
                    Autonomous loop

Core Components

1. The Brain (LLM)

The reasoning engine that makes decisions:

# Claude as the agent brain
from anthropic import Anthropic

client = Anthropic()

def agent_think(task: str, context: dict) -> str:
    """Agent reasoning with Claude"""
    response = client.messages.create(
        model="claude-sonnet-4-5-20250929",
        max_tokens=4096,
        system="You are an autonomous agent. Analyze the task and decide what action to take.",
        messages=[{
            "role": "user",
            "content": f"Task: {task}\nContext: {context}\nWhat should I do next?"
        }]
    )
    return response.content[0].text

Popular models for agents:

Claude 3.5 Sonnet: Best for complex reasoning, tool use
GPT-4: Strong general capabilities
Mixtral: Open-source alternative
Gemini Pro: Google's multimodal option

Related: Claude Agent Patterns for Claude-specific best practices

2. Memory Systems

Agents need memory to maintain context and learn:

Short-term memory (conversation buffer):

class AgentMemory:
    def __init__(self, max_messages: int = 10):
        self.messages = []
        self.max_messages = max_messages

    def add(self, role: str, content: str):
        self.messages.append({"role": role, "content": content})
        if len(self.messages) > self.max_messages:
            self.messages.pop(0)  # FIFO

    def get_context(self) -> list:
        return self.messages

Long-term memory (vector database):

from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

class LongTermMemory:
    def __init__(self):
        self.client = QdrantClient(":memory:")
        self.encoder = SentenceTransformer('all-MiniLM-L6-v2')

    def store(self, memory: str, metadata: dict):
        """Store experience in vector DB"""
        vector = self.encoder.encode(memory).tolist()
        self.client.upsert(
            collection_name="memories",
            points=[{
                "id": hash(memory),
                "vector": vector,
                "payload": {"text": memory, **metadata}
            }]
        )

    def recall(self, query: str, limit: int = 5):
        """Retrieve relevant memories"""
        vector = self.encoder.encode(query).tolist()
        results = self.client.search(
            collection_name="memories",
            query_vector=vector,
            limit=limit
        )
        return [hit.payload for hit in results]

Deep dive: Agent Memory Systems

3. Tool Use

Agents extend their capabilities through tools:

Tool definition:

from typing import Callable, Dict, Any

class Tool:
    def __init__(
        self,
        name: str,
        description: str,
        function: Callable,
        parameters: Dict[str, Any]
    ):
        self.name = name
        self.description = description
        self.function = function
        self.parameters = parameters

    def execute(self, **kwargs) -> Any:
        return self.function(**kwargs)

# Example tools
web_search = Tool(
    name="web_search",
    description="Search the web for current information",
    function=lambda query: search_api(query),
    parameters={"query": {"type": "string", "required": True}}
)

calculator = Tool(
    name="calculator",
    description="Perform mathematical calculations",
    function=lambda expression: eval(expression),
    parameters={"expression": {"type": "string", "required": True}}
)

Related: Tool Use & Function Calling

4. Planning & Reasoning

Agents use various strategies to plan actions:

ReAct Pattern (Reasoning + Acting):

Thought: I need to find the current weather in Tokyo
Action: web_search("Tokyo weather today")
Observation: Temperature is 18°C, partly cloudy
Thought: Now I can answer the user's question
Action: respond("The weather in Tokyo is 18°C and partly cloudy")

Chain-of-Thought:

def chain_of_thought_prompt(question: str) -> str:
    return f"""Let's approach this step-by-step:
1. First, identify what information we need
2. Break down the problem into sub-tasks
3. Execute each sub-task
4. Synthesize the results

Question: {question}

Let's begin:"""

Tree of Thoughts (explore multiple reasoning paths):

class ThoughtTree:
    def __init__(self, root_thought: str):
        self.root = {"thought": root_thought, "children": [], "score": 0}

    def expand(self, node: dict, num_branches: int = 3):
        """Generate alternative reasoning paths"""
        for i in range(num_branches):
            child_thought = generate_next_thought(node["thought"])
            child = {
                "thought": child_thought,
                "children": [],
                "score": evaluate_thought(child_thought)
            }
            node["children"].append(child)

    def best_path(self) -> list:
        """Find highest-scoring reasoning path"""
        return traverse_tree(self.root, key=lambda n: n["score"])

Agent Architectures

1. Simple ReAct Agent

Single LLM call with tools:

class ReActAgent:
    def __init__(self, llm, tools: list[Tool]):
        self.llm = llm
        self.tools = {t.name: t for t in tools}

    def run(self, task: str, max_iterations: int = 10):
        """Execute ReAct loop"""
        context = []

        for i in range(max_iterations):
            # Reasoning step
            prompt = self._build_prompt(task, context)
            response = self.llm.generate(prompt)

            # Parse action
            action = self._parse_action(response)
            if action["type"] == "final_answer":
                return action["content"]

            # Execute tool
            tool = self.tools[action["tool"]]
            result = tool.execute(**action["args"])
            context.append({
                "thought": response,
                "action": action,
                "observation": result
            })

        return "Max iterations reached"

2. Multi-Agent Systems

Specialized agents collaborating:

class MultiAgentSystem:
    def __init__(self):
        self.agents = {
            "researcher": ResearchAgent(),
            "writer": WriterAgent(),
            "critic": CriticAgent(),
        }

    async def solve(self, task: str):
        """Collaborative problem solving"""
        # Research phase
        research = await self.agents["researcher"].investigate(task)

        # Writing phase
        draft = await self.agents["writer"].write(research)

        # Review phase
        feedback = await self.agents["critic"].review(draft)

        # Refinement
        final = await self.agents["writer"].revise(draft, feedback)
        return final

Deep dive: Multi-Agent Systems

3. Hierarchical Agents

Manager delegates to specialists:

Manager Agent
    ├── Planning Agent (strategy)
    ├── Execution Agent (actions)
    └── Monitoring Agent (validation)

class HierarchicalAgent:
    def __init__(self):
        self.manager = ManagerAgent()
        self.specialists = {
            "planner": PlanningAgent(),
            "executor": ExecutionAgent(),
            "monitor": MonitorAgent(),
        }

    async def run(self, goal: str):
        # Manager creates plan
        plan = await self.manager.plan(goal)

        # Delegate to specialists
        results = []
        for step in plan.steps:
            specialist = self.specialists[step.type]
            result = await specialist.execute(step)
            results.append(result)

        # Manager synthesizes
        return await self.manager.synthesize(results)

Agent Types

1. Task Completion Agents

Goal: Complete specific tasks (research, data analysis, code generation)

class TaskAgent:
    async def complete_task(self, task_description: str):
        # Understand task
        requirements = await self.analyze_task(task_description)

        # Break into steps
        plan = await self.create_plan(requirements)

        # Execute steps
        for step in plan:
            result = await self.execute_step(step)
            if not self.validate(result):
                await self.revise_plan(step)

        return self.synthesize_results()

Examples:

Research assistant (gather information)
Code generator (write functions)
Data analyst (SQL queries, visualizations)

2. Conversational Agents

Goal: Natural dialogue with memory and personality

class ConversationalAgent:
    def __init__(self, personality: str):
        self.personality = personality
        self.memory = AgentMemory()

    async def chat(self, user_message: str):
        # Add to memory
        self.memory.add("user", user_message)

        # Generate response with personality
        system_prompt = f"You are {self.personality}"
        response = await self.llm.generate(
            system=system_prompt,
            messages=self.memory.get_context()
        )

        self.memory.add("assistant", response)
        return response

Examples:

Customer support bots
Personal assistants
Tutoring systems

3. Autonomous Agents

Goal: Continuous operation toward long-term goals

class AutonomousAgent:
    def __init__(self, goal: str):
        self.goal = goal
        self.state = {}

    async def run_loop(self):
        """Continuous operation"""
        while not self.goal_achieved():
            # Perceive environment
            observations = await self.observe()

            # Update internal state
            self.state = await self.update_state(observations)

            # Decide next action
            action = await self.decide(self.state)

            # Execute action
            result = await self.execute(action)

            # Learn from result
            await self.learn(action, result)

            await asyncio.sleep(self.tick_rate)

Examples:

AutoGPT (recursive task completion)
Cryptocurrency trading bots
DevOps automation agents

Key Concepts

Agentic Behavior

What makes something truly "agentic":

Autonomy: Self-directed action without human intervention
Reactivity: Responds to environment changes
Proactivity: Takes initiative toward goals
Social ability: Interacts with other agents/humans

Grounding

Connecting agent actions to real-world effects:

def grounded_action(action: str, world_model: WorldState):
    """Ensure action has real effect"""
    # Verify pre-conditions
    if not world_model.check_preconditions(action):
        raise InvalidAction("Preconditions not met")

    # Execute with confirmation
    result = execute(action)

    # Verify post-conditions
    new_state = world_model.update(result)
    if not new_state.matches_expected():
        rollback(action)
        raise ActionFailed("Post-conditions not satisfied")

    return new_state

Tool Calling vs Code Execution

Tool calling: Structured function invocation

# LLM returns structured tool call
{"tool": "web_search", "args": {"query": "Python tutorials"}}

Code execution: Generate and run arbitrary code

# LLM generates code
code = """
import requests
data = requests.get('https://api.example.com').json()
result = [item['name'] for item in data if item['active']]
"""
exec(code)  # ⚠️ Security implications!

See: Agent Security Considerations

Evaluation

Measuring agent performance:

Success Metrics

class AgentMetrics:
    def __init__(self):
        self.tasks_completed = 0
        self.tasks_failed = 0
        self.avg_steps = []
        self.tool_usage = defaultdict(int)

    def task_success_rate(self) -> float:
        total = self.tasks_completed + self.tasks_failed
        return self.tasks_completed / total if total > 0 else 0

    def avg_steps_to_completion(self) -> float:
        return sum(self.avg_steps) / len(self.avg_steps)

    def tool_efficiency(self) -> dict:
        """Which tools are most useful?"""
        total_calls = sum(self.tool_usage.values())
        return {
            tool: count / total_calls
            for tool, count in self.tool_usage.items()
        }

Benchmarks

WebArena: Web navigation tasks
SWE-bench: Software engineering tasks
GAIA: General AI assistant tasks
AgentBench: Multi-domain capabilities

Related: Agent Evaluation & Testing

Common Patterns

1. Retry with Refinement

async def retry_with_feedback(agent, task, max_attempts=3):
    """Retry failed tasks with error feedback"""
    for attempt in range(max_attempts):
        try:
            result = await agent.execute(task)
            return result
        except Exception as e:
            if attempt < max_attempts - 1:
                task = f"{task}\n\nPrevious attempt failed: {e}\nPlease try a different approach."
            else:
                raise

2. Human-in-the-Loop

class HumanApprovalAgent:
    async def execute_with_approval(self, action):
        """Require human approval for sensitive actions"""
        if action.is_sensitive():
            print(f"Agent wants to: {action.description}")
            approval = input("Approve? (yes/no): ")
            if approval.lower() != "yes":
                return "Action rejected by human"

        return await action.execute()

3. Self-Reflection

async def self_reflect(agent, task, result):
    """Agent critiques its own work"""
    critique = await agent.llm.generate(f"""
    Task: {task}
    My result: {result}

    Critically evaluate:
    1. Did I fully address the task?
    2. Are there errors or gaps?
    3. How could I improve?
    """)

    if critique.suggests_revision():
        return await agent.revise(result, critique)
    return result

Practical Applications

Customer Support

class SupportAgent:
    async def handle_ticket(self, ticket):
        # Classify issue
        category = await self.classify(ticket.description)

        # Search knowledge base
        kb_results = await self.search_kb(ticket.description)

        # If known solution exists
        if kb_results:
            return self.format_solution(kb_results[0])

        # Otherwise, escalate to human
        return await self.escalate(ticket, reason="No KB match")

Code Assistant

class CodeAgent:
    async def implement_feature(self, spec: str):
        # Generate code
        code = await self.generate_code(spec)

        # Run tests
        test_results = await self.run_tests(code)

        # Fix if tests fail
        while not test_results.passed:
            errors = test_results.errors
            code = await self.fix_code(code, errors)
            test_results = await self.run_tests(code)

        return code

Research Assistant

class ResearchAgent:
    async def research_topic(self, topic: str):
        # Web search
        sources = await self.web_search(topic)

        # Extract information
        facts = []
        for source in sources:
            content = await self.fetch_url(source.url)
            extracted = await self.extract_facts(content, topic)
            facts.extend(extracted)

        # Synthesize report
        report = await self.synthesize(facts)

        # Add citations
        return self.add_citations(report, sources)

Challenges & Limitations

1. Reliability

Agents can fail unpredictably:

Tool calls with wrong arguments
Infinite loops
Context window overflow
Hallucinated actions

Mitigation: Agent Evaluation & Testing

2. Cost

LLM API costs accumulate:

# Track token usage
class CostTracker:
    def __init__(self, cost_per_1k_tokens: float):
        self.cost_per_1k = cost_per_1k_tokens
        self.total_tokens = 0

    def add_usage(self, input_tokens: int, output_tokens: int):
        self.total_tokens += input_tokens + output_tokens

    def total_cost(self) -> float:
        return (self.total_tokens / 1000) * self.cost_per_1k

Mitigation: Production Agent Deployment