Intelligent Agents

Agents

Agent = Architecture + Program

Agent: Anything that can perceive its environment through sensors and act upon that environment through actuators. This is a very broad definition that encompasses both simple thermostats and complex self-driving cars.

Example: Agent Types

Human agents: Eyes/ears as sensors, hands/legs as actuators.
Robotic agents: Cameras/lidar as sensors, motors/wheels as actuators.
Software agents: Keystrokes/network packets as sensors, screen displays/network requests as actuators.

Terms

Percept: The agent’s perceptual inputs at a given instant.

Percept Sequence: The complete history of everything the agent has perceived.

Agent Function: Maps any given percept sequence to an action, $f: P^* \to A$, where $P^*$ is the set of all possible percept sequences and $A$ is the set of actions.

Agent Program: The concrete implementation that runs on the physical architecture to produce the agent function f.

Performance Measure: An objective criterion for the success of an agent’s behavior.

It should generally measure what you actually want achieved in the environment, rather than how you think the agent should behave. Designing the right performance measure is notoriously difficult; the difficulty is closely related to what modern AI safety calls the “alignment problem”.

Rationality

An agent should “do the right thing” based on what it can perceive and the actions it can perform.

The right action is the one that will cause the agent to be most successful.

Example: Robovac

Which performance measure is better for a robovac?

+1 point for each unit of dirt cleaned up per time unit, or
+1 point for each clean square per time unit

The second measure is better. The first would reward a robovac that sucks up dirt, dumps it back out, and sucks it up again just to farm points!
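The difference between the two measures can be checked with a small simulation. This is a minimal sketch under stated assumptions: a one-square world, three hypothetical actions ("suck", "dump", "idle"), and a helper named `run` that is not part of the original notes.

```python
from typing import List, Tuple


def run(actions: List[str], start_dirty: bool = True) -> Tuple[int, int]:
    """Simulate a one-square world and score it under both measures.

    Returns (dirt_cleaned, clean_square_steps):
      - dirt_cleaned: +1 each time dirt is actually sucked up (first measure)
      - clean_square_steps: +1 for each time step that ends with a clean square
        (second measure)
    """
    dirty = start_dirty
    dirt_cleaned = 0
    clean_square_steps = 0
    for action in actions:
        if action == "suck" and dirty:
            dirty = False
            dirt_cleaned += 1
        elif action == "dump" and not dirty:
            dirty = True  # the robovac puts the dirt back
        if not dirty:
            clean_square_steps += 1
    return dirt_cleaned, clean_square_steps


honest = ["suck", "idle", "idle", "idle", "idle", "idle"]
farmer = ["suck", "dump", "suck", "dump", "suck", "dump"]

print(run(honest))  # (1, 6)
print(run(farmer))  # (3, 3)
```

Under the first measure the point-farming robovac wins (3 against 1); under the second the honest one wins (6 against 3), which matches the behavior we actually want.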

Rational Agent

For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent has.

Aside: On Omniscience, Learning, and Autonomy

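Rationality isn’t omniscience: a rational agent maximizes expected performance, not actual performance, because it cannot know the actual outcome of its actions in advance. An agent is autonomous if its behavior is determined by its own experience (it can learn and adapt) rather than only by knowledge built in by its designer.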
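To make the gap between expected and actual performance concrete, here is a minimal sketch. The action names, probabilities, and scores are invented for illustration; only the idea of maximizing the expected score comes from the definition above.

```python
from typing import Dict, List, Tuple

# Each action maps to a list of (probability, score) outcomes. Hypothetical numbers.
Outcomes = List[Tuple[float, float]]


def expected_score(outcomes: Outcomes) -> float:
    """Probability-weighted average of the possible scores."""
    return sum(p * s for p, s in outcomes)


def rational_choice(options: Dict[str, Outcomes]) -> str:
    """Pick the action with the highest expected score."""
    return max(options, key=lambda action: expected_score(options[action]))


OPTIONS: Dict[str, Outcomes] = {
    # Usually fine (score 10), occasionally disastrous (score -100).
    "cross_now": [(0.9, 10.0), (0.1, -100.0)],
    # Always a modest score.
    "wait_for_green": [(1.0, 5.0)],
}

print(rational_choice(OPTIONS))  # wait_for_green
```

Nine times out of ten, crossing immediately would have scored higher in hindsight. Waiting is still the rational choice, because its expected score (5) beats the expected score of crossing (about -1).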

Environmental Properties

PEAS

Performance measure, Environment, Actuators, Sensors. Structuring the problem via PEAS helps prevent feature creep when designing an AI system.

A method for specifying the task environment.

Note — Specifying the task environment is always the first step in designing an agent.

Example: Applying PEAS to an Automated Taxi

Performance Measure	Environment	Actuators	Sensors
safe, fast, legal, comfortable trip; maximize profits	roads, other traffic, pedestrians, customers	steering, accelerator, brake, signal, horn, display	cameras, sonar, speedometer, GPS, odometer, accelerometer, engine sensors, keyboard
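One way to keep a PEAS description explicit in code is a small record type. This is only a sketch: the class name `PEAS` and the helper `missing_parts` are hypothetical, and the taxi entries are abbreviated from the table above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PEAS:
    """A task-environment specification with the four PEAS components."""
    performance: List[str]
    environment: List[str]
    actuators: List[str]
    sensors: List[str]


def missing_parts(spec: PEAS) -> List[str]:
    """Names of PEAS components that were left empty."""
    return [name for name, value in vars(spec).items() if not value]


taxi = PEAS(
    performance=["safe", "fast", "legal", "comfortable trip", "maximize profits"],
    environment=["roads", "other traffic", "pedestrians", "customers"],
    actuators=["steering", "accelerator", "brake", "signal", "horn", "display"],
    sensors=["cameras", "sonar", "speedometer", "GPS", "odometer"],
)

print(missing_parts(taxi))  # []
```

A check like `missing_parts` is a cheap reminder that the specification is incomplete until all four components are filled in.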

Properties of Task Environments

(Parenthesized text is what the real-world environment tends to be)

Fully Observable (vs. Partially Observable): The agent’s sensors give it access to the complete state of the environment at each point in time.
Deterministic (vs. Stochastic): The next state is determined entirely by the current state and the agent’s action. (If the environment is deterministic except for the actions of other agents, it is strategic.)
Episodic (vs. Sequential): The agent’s experience is divided into atomic episodes, and actions in one episode don’t affect future episodes. Episodic environments are much easier to solve because the agent doesn’t need to think ahead.
Static (vs. Dynamic): The environment is unchanged while the agent is deliberating. (If the environment itself doesn’t change with time but the agent’s performance score does, it is semidynamic.)
Discrete (vs. Continuous): There is a limited number of distinct, clearly defined percepts and actions.
Single Agent (vs. Multiagent): The agent operates by itself in the environment. A multiagent environment can be competitive or cooperative.
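The six properties can be recorded as flags, which makes it easy to compare environments. The sketch below assumes a hypothetical `TaskEnvironment` class; the two classifications (crossword puzzle and taxi driving) follow the usual textbook treatment.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TaskEnvironment:
    """True means the environment has the easier alternative of each pair."""
    name: str
    fully_observable: bool
    deterministic: bool
    episodic: bool
    static: bool
    discrete: bool
    single_agent: bool

    def hard_properties(self) -> List[str]:
        """Harder alternatives that apply to this environment."""
        labels = {
            "fully_observable": "partially observable",
            "deterministic": "stochastic",
            "episodic": "sequential",
            "static": "dynamic",
            "discrete": "continuous",
            "single_agent": "multiagent",
        }
        return [hard for easy, hard in labels.items() if not getattr(self, easy)]


crossword = TaskEnvironment("crossword puzzle", True, True, False, True, True, True)
taxi = TaskEnvironment("taxi driving", False, False, False, False, False, False)

print(crossword.hard_properties())  # ['sequential']
print(len(taxi.hard_properties()))  # 6
```

Taxi driving takes the harder alternative on every axis, which is typical of real-world environments.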

Agent = Architecture + Program

Agent Program vs. Agent Function

	Input
Agent Program	Takes only the current percept as input (nothing more is available from the environment); if it needs earlier percepts, it must remember them itself
Agent Function	Takes the entire percept history (defined over all percepts seen so far)
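The distinction can be shown with a toy example. This is a sketch under invented assumptions: percepts are the strings "hot" and "cold", and the hypothetical rule is to raise an alarm after two consecutive "hot" percepts. The agent function is written over the whole percept sequence, while the agent program sees one percept at a time and keeps just enough internal memory to produce the same actions.

```python
from typing import List


def agent_function(percepts: List[str]) -> str:
    """f: P* -> A, defined over the entire percept sequence."""
    return "alarm" if percepts[-2:] == ["hot", "hot"] else "wait"


class AgentProgram:
    """Receives only the current percept; remembers what it needs internally."""

    def __init__(self) -> None:
        self.consecutive_hot = 0  # compact memory instead of the full history

    def __call__(self, percept: str) -> str:
        self.consecutive_hot = self.consecutive_hot + 1 if percept == "hot" else 0
        return "alarm" if self.consecutive_hot >= 2 else "wait"


sequence = ["hot", "hot", "cold", "hot", "hot", "hot"]
program = AgentProgram()
for t, percept in enumerate(sequence, start=1):
    # The program's action on each step matches f applied to the prefix so far.
    assert program(percept) == agent_function(sequence[:t])
```

The program never stores the percept sequence, yet it implements the same mapping as the function for this rule.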

Table-Driven Agent

Construct a table that contains the appropriate action for every possible percept sequence.

Why not? Drawbacks: the table is huge, takes a long time to construct, gives the agent no autonomy, and would take too long to learn from experience. For a game like chess, the table would have more entries than there are atoms in the observable universe.
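A table-driven agent is easy to write, and the size of its table is easy to count. The sketch below assumes a two-square vacuum world with 4 possible percepts (location A or B, status Clean or Dirty); the names `TableDrivenAgent` and `table_size` are hypothetical. With |P| possible percepts and a lifetime of T steps, the table needs one entry per percept sequence of length 1 to T.

```python
from typing import Dict, List, Optional, Tuple

Percept = Tuple[str, str]  # (location, status)


def table_size(num_percepts: int, lifetime: int) -> int:
    """Number of percept sequences of length 1..lifetime."""
    return sum(num_percepts ** t for t in range(1, lifetime + 1))


class TableDrivenAgent:
    """Looks up the action for the entire percept sequence seen so far."""

    def __init__(self, table: Dict[Tuple[Percept, ...], str]) -> None:
        self.table = table
        self.percepts: List[Percept] = []

    def __call__(self, percept: Percept) -> Optional[str]:
        self.percepts.append(percept)
        return self.table.get(tuple(self.percepts))  # None if the entry is missing


# A deliberately tiny, incomplete table.
TABLE: Dict[Tuple[Percept, ...], str] = {
    (("A", "Dirty"),): "Suck",
    (("A", "Clean"),): "Right",
    (("A", "Dirty"), ("A", "Clean")): "Right",
}

print(table_size(4, 3))   # 84
print(table_size(4, 10))  # 1398100
```

Even this toy world needs over a million entries for a ten-step lifetime, and any sequence missing from the table leaves the agent without an action.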

Five Basic Agent Types

(Arranged in order of increasing generality)

1. Simple Reflex Agents

Simply maps “what the world is like now” → “what action should I do now”, using only the current percept and ignoring the rest of the percept history.

function SIMPLE-REFLEX-AGENT(percept) returns action
    static: rules // A set of condition-action rules
    state ← INTERPRET-INPUT(percept)
    rule ← RULE-MATCH(state, rules)
    action ← RULE-ACTION[rule]
    return action
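The pseudocode above can be sketched in Python for a two-square vacuum world. The world, the rule set, and the function names are assumptions made for illustration; rules are tried in order and the first matching condition wins.

```python
from typing import Callable, List, Tuple

Percept = Tuple[str, str]  # (location, status), e.g. ("A", "Dirty")
Rule = Tuple[Callable[[Percept], bool], str]  # (condition, action)

# Condition-action rules, checked in order.
RULES: List[Rule] = [
    (lambda state: state[1] == "Dirty", "Suck"),
    (lambda state: state[0] == "A", "Right"),
    (lambda state: state[0] == "B", "Left"),
]


def interpret_input(percept: Percept) -> Percept:
    """In this tiny world the percept already is the state description."""
    return percept


def rule_match(state: Percept, rules: List[Rule]) -> str:
    """Return the action of the first rule whose condition holds."""
    for condition, action in rules:
        if condition(state):
            return action
    return "NoOp"


def simple_reflex_agent(percept: Percept) -> str:
    state = interpret_input(percept)
    return rule_match(state, RULES)


print(simple_reflex_agent(("A", "Dirty")))  # Suck
print(simple_reflex_agent(("A", "Clean")))  # Right
```

Because the agent has no memory, it keeps shuttling between clean squares forever; it cannot tell that its job is done.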

2. Model-based Reflex Agents

Maintains internal state to track the parts of the world it can’t see right now. Updating that state requires two kinds of knowledge: how the world evolves independently of the agent, and what the agent’s own actions do to the world.

function REFLEX-AGENT-WITH-STATE(percept) returns action
    static: state // A description of the current world state
            rules // A set of condition-action rules
            action // The most recent action, initially none
    state ← UPDATE-STATE(state, action, percept)
    rule ← RULE-MATCH(state, rules)
    action ← RULE-ACTION[rule]
    return action
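A minimal Python sketch of the same idea, again for a hypothetical two-square vacuum world. The internal model records the last known status of each square. It assumes that a square stays clean once cleaned (how the world evolves) and that sucking cleans the current square (what actions do).

```python
from typing import Dict, Optional, Tuple

Percept = Tuple[str, str]  # (location, status)


class ModelBasedVacuumAgent:
    def __init__(self) -> None:
        # None means "not observed yet".
        self.model: Dict[str, Optional[str]] = {"A": None, "B": None}

    def update_state(self, percept: Percept) -> None:
        location, status = percept
        self.model[location] = status

    def __call__(self, percept: Percept) -> str:
        location, status = percept
        self.update_state(percept)
        if status == "Dirty":
            # What my actions do: sucking leaves the current square clean.
            self.model[location] = "Clean"
            return "Suck"
        if all(s == "Clean" for s in self.model.values()):
            return "NoOp"  # the model says there is nothing left to do
        return "Right" if location == "A" else "Left"


agent = ModelBasedVacuumAgent()
print(agent(("A", "Dirty")))  # Suck
print(agent(("A", "Clean")))  # Right
print(agent(("B", "Clean")))  # NoOp
```

Unlike the simple reflex agent, this one stops once its model says both squares are clean, even though it can only sense one square at a time.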

3. Goal-based Agents

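Maps “what the world is like now” → “what it will be like if I do action A” → “what action should I do now”. The agent chooses actions by predicting their results and checking whether they lead toward a specific goal, which may require searching or planning over sequences of actions.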
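As a sketch of “what it will be like if I do action A”, the agent below uses a transition model to search for an action sequence that reaches its goal and then takes the first step. The room map, the names `plan` and `goal_based_agent`, and the choice of breadth-first search are all assumptions for illustration; any search or planning method could fill this role.

```python
from collections import deque
from typing import Dict, List, Optional

# Hypothetical transition model: state -> {action: resulting state}.
TRANSITIONS: Dict[str, Dict[str, str]] = {
    "hall": {"north": "kitchen", "east": "study"},
    "kitchen": {"east": "pantry"},
    "study": {"north": "pantry", "east": "garage"},
    "pantry": {},
    "garage": {},
}


def plan(start: str, goal: str,
         transitions: Dict[str, Dict[str, str]]) -> Optional[List[str]]:
    """Breadth-first search for a shortest action sequence from start to goal."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, actions = frontier.popleft()
        if state == goal:
            return actions
        for action, next_state in transitions.get(state, {}).items():
            if next_state not in visited:
                visited.add(next_state)
                frontier.append((next_state, actions + [action]))
    return None  # goal unreachable


def goal_based_agent(state: str, goal: str) -> str:
    """Take the first action of a plan that reaches the goal."""
    actions = plan(state, goal, TRANSITIONS)
    return actions[0] if actions else "NoOp"


print(plan("hall", "pantry", TRANSITIONS))  # ['north', 'east']
print(goal_based_agent("hall", "garage"))   # east
```

Changing the goal changes the behavior without rewriting any rules, which is the main flexibility a goal-based agent gains over a reflex agent.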

4. Utility-based Agents

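Uses a utility function that maps each state to a number describing its degree of “happiness”. This lets the agent trade off multiple or conflicting goals (e.g., speed vs. safety) instead of only distinguishing goal states from non-goal states.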
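The speed vs. safety trade-off can be sketched with a weighted utility function. The routes, numbers, and the `risk_weight` parameter are invented for illustration; the point is only that a single numeric utility lets the agent rank outcomes that a binary goal test cannot distinguish.

```python
from typing import Dict

State = Dict[str, float]

# Hypothetical predicted outcome of each action.
ROUTES: Dict[str, State] = {
    "highway": {"minutes": 20.0, "risk": 0.30},
    "back_road": {"minutes": 35.0, "risk": 0.05},
}


def utility(state: State, risk_weight: float) -> float:
    """Higher is better: penalize travel time and (weighted) risk."""
    return -state["minutes"] - risk_weight * 100.0 * state["risk"]


def utility_based_agent(outcomes: Dict[str, State], risk_weight: float) -> str:
    """Choose the action whose predicted outcome has the highest utility."""
    return max(outcomes, key=lambda action: utility(outcomes[action], risk_weight))


print(utility_based_agent(ROUTES, risk_weight=1.0))  # back_road
print(utility_based_agent(ROUTES, risk_weight=0.2))  # highway
```

Both routes reach the destination, so a goal-based agent would treat them as equally good. The utility function is what expresses how much safety the agent gives up for speed.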

5. Learning Agents

Agents that can operate in initially unknown environments and become more competent over time. Learning agents aren’t really a fifth separate category: you can have a learning utility-based agent, a learning reflex agent, and so on. A learning agent has four components:

Performance Element: The “agent” part that chooses actions.
Learning Element: Responsible for making improvements.
Critic: Evaluates how well the agent is doing against a fixed performance standard.
Problem Generator: Suggests actions that will lead to new, informative experiences (exploration).
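The four components can be sketched as methods of one small class. Everything here is an assumption made for illustration: a two-action toy environment, a critic whose fixed standard is “a clean outcome is worth 1, anything else 0”, a running-average learning rule, and deterministic exploration on every fifth step instead of random exploration.

```python
from typing import Dict, List


class LearningAgent:
    def __init__(self, actions: List[str], explore_every: int = 5) -> None:
        self.actions = list(actions)
        self.estimates: Dict[str, float] = {a: 0.0 for a in self.actions}
        self.counts: Dict[str, int] = {a: 0 for a in self.actions}
        self.explore_every = explore_every
        self.step = 0

    def performance_element(self) -> str:
        """Choose the action currently believed to be best."""
        return max(self.actions, key=lambda a: self.estimates[a])

    def problem_generator(self) -> str:
        """Suggest the least-tried action to gain new experience."""
        return min(self.actions, key=lambda a: self.counts[a])

    def critic(self, outcome: str) -> float:
        """Fixed performance standard: a clean outcome scores 1, anything else 0."""
        return 1.0 if outcome == "clean" else 0.0

    def learning_element(self, action: str, feedback: float) -> None:
        """Update the running-average estimate for the action just taken."""
        self.counts[action] += 1
        self.estimates[action] += (feedback - self.estimates[action]) / self.counts[action]

    def choose(self) -> str:
        explore = self.step % self.explore_every == 0
        self.step += 1
        return self.problem_generator() if explore else self.performance_element()


# Toy environment, unknown to the agent: only "right" leads to a clean outcome.
OUTCOME = {"left": "dirty", "right": "clean"}

agent = LearningAgent(["left", "right"])
for _ in range(20):
    action = agent.choose()
    agent.learning_element(action, agent.critic(OUTCOME[action]))

print(agent.performance_element())  # right
```

The agent starts out preferring "left" because all estimates are equal. Only the exploratory step suggested by the problem generator reveals that "right" is better, after which the performance element switches to it.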
