$$\text{Agent} = \text{Architecture} + \text{Program}$$
Agent: Anything that can perceive its environment through sensors and act upon that environment through actuators. This is a very broad definition that encompasses both simple thermostats and complex self-driving cars.
Percept: The agent’s perceptual inputs at any given instant
Percept Sequence: The complete history of everything the agent has perceived
Agent Function: Maps any given percept sequence to an action, $f: P^* \to A$, where $P^*$ is the set of all possible percept sequences.
Performance Measure: Objective criterion for success of an agent’s behavior
An agent should “do the right thing” based on what it can perceive and the actions it can currently perform.
Which performance measure is better for a robovac?
A measure that simply counts the dirt sucked up would reward a robovac that sucks $\to$ dumps $\to$ sucks just to farm points!
For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure.
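As a rough illustration of "expected to maximize", the sketch below picks the action with the highest expected score. The outcome model, the outcome names, and all numbers are made-up assumptions for illustration only; they do not come from the notes.

```python
from typing import Callable, Dict, Hashable, Iterable

# Hypothetical outcome model: action -> {outcome: probability}.
OutcomeModel = Dict[Hashable, Dict[Hashable, float]]


def expected_performance(action: Hashable,
                         model: OutcomeModel,
                         score: Callable[[Hashable], float]) -> float:
    """Probability-weighted average of the performance score for one action."""
    return sum(p * score(outcome) for outcome, p in model[action].items())


def rational_action(actions: Iterable[Hashable],
                    model: OutcomeModel,
                    score: Callable[[Hashable], float]) -> Hashable:
    """Return the action whose expected performance is highest."""
    return max(actions, key=lambda a: expected_performance(a, model, score))


# Illustrative numbers only (assumed, not from the source).
MODEL: OutcomeModel = {
    "cross_now": {"arrive_early": 0.9, "accident": 0.1},
    "wait": {"arrive_late": 1.0},
}
SCORES = {"arrive_early": 10.0, "arrive_late": 6.0, "accident": -100.0}

best = rational_action(MODEL.keys(), MODEL, SCORES.__getitem__)
print(best)  # "wait": expected 6.0 beats 0.9 * 10 + 0.1 * (-100) = -1.0
```

Under these assumed numbers the agent waits, even though crossing would usually turn out better in any single trial; the choice is judged on expectation, not on the outcome that happens to occur.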
Rationality isn’t omniscience: a rational agent maximizes expected performance, which can differ from the performance it actually achieves. An agent is autonomous if its behavior is determined by its own experience (it can learn and adapt) rather than solely by built-in knowledge.
PEAS: Performance measure, Environment, Actuators, Sensors. Structuring the problem via PEAS helps prevent feature creep when designing an AI system.
Note — Specifying the task environment is always the first step in designing an agent.
| Performance Measure | Environment | Actuators | Sensors |
|---|---|---|---|
| safe, fast, legal, comfortable trip, maximize profits | roads, other traffic, pedestrians, customers | steering, accelerator, brake, signal, horn, display | camera, sonar, speedometer, GPS, odometer, engine sensors, keyboard, accelerometer |
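One way to keep a PEAS description next to the agent code is a small immutable record. This is only a sketch: the `PEAS` class and the `taxi_peas` name are assumptions introduced here, and the sensor list is abbreviated from the row above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PEAS:
    """Task-environment description (hypothetical helper, not from the notes)."""
    performance: Tuple[str, ...]
    environment: Tuple[str, ...]
    actuators: Tuple[str, ...]
    sensors: Tuple[str, ...]


# Values taken from the table row above; sensors are abbreviated.
taxi_peas = PEAS(
    performance=("safe", "fast", "legal", "comfortable trip", "maximize profits"),
    environment=("roads", "other traffic", "pedestrians", "customers"),
    actuators=("steering", "accelerator", "brake", "signal", "horn", "display"),
    sensors=("camera", "sonar", "speedometer", "GPS", "odometer", "engine sensors"),
)

print(len(taxi_peas.actuators))  # 6
```

Freezing the dataclass keeps the specification fixed once the design is agreed on, which matches the idea of settling the task environment before building the agent.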
(Parenthesized text is what the real-world environment tends to be)
| | Input |
|---|---|
| Agent Program | Takes only the current percept as input (nothing more is available from the environment) |
| Agent Function | Takes the entire percept history (has memory of all percepts) |
Construct a table that contains the appropriate action for every possible percept sequence.
Why? Drawbacks: a huge table, a long time to construct, no autonomy, and, even with learning, too long to learn the table entries. For a game like chess, the table would have more entries than there are atoms in the observable universe.
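A minimal sketch of a table-driven agent follows, together with a helper that counts how many table entries are needed. The two-square vacuum world, the `TABLE` contents, and the function names are assumptions chosen for illustration.

```python
from typing import Callable, Dict, Hashable, List, Optional, Tuple

Percept = Tuple[str, str]  # (location, status), an assumed encoding


def make_table_driven_agent(
    table: Dict[Tuple[Percept, ...], str]
) -> Callable[[Percept], Optional[str]]:
    """Return an agent that looks up its whole percept history in `table`."""
    percepts: List[Percept] = []  # the percept sequence seen so far

    def agent(percept: Percept) -> Optional[str]:
        percepts.append(percept)
        return table.get(tuple(percepts))  # None if the sequence is missing

    return agent


def table_size(num_percepts: int, lifetime: int) -> int:
    """Entries needed: one per percept sequence of length 1..lifetime."""
    return sum(num_percepts ** t for t in range(1, lifetime + 1))


# A tiny, deliberately incomplete table for a two-square vacuum world.
TABLE = {
    (("A", "Dirty"),): "Suck",
    (("A", "Clean"),): "Right",
    (("A", "Clean"), ("B", "Dirty")): "Suck",
}

agent = make_table_driven_agent(TABLE)
print(agent(("A", "Clean")))   # Right
print(agent(("B", "Dirty")))   # Suck
print(table_size(4, 3))        # 4 + 16 + 64 = 84
```

Even with only 4 distinct percepts, the entry count grows exponentially with the agent's lifetime, which is the "huge table" drawback in concrete form.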
(Arranged in order of increasing generality)
Simply maps “what the world is like now” $\to$ “what action I should do now”.
function SIMPLE-REFLEX-AGENT(percept) returns action
    static: rules  // a set of condition-action rules
    state ← INTERPRET-INPUT(percept)
    rule ← RULE-MATCH(state, rules)
    action ← RULE-ACTION[rule]
    return action
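The pseudocode above might look like this in Python for a two-square vacuum world. The percept encoding, the rule set, and the `"NoOp"` fallback are assumptions for illustration; each rule pairs a condition with its action, so RULE-MATCH and RULE-ACTION are folded into one step.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]
Rule = Tuple[Callable[[State], bool], str]  # (condition, action)


def interpret_input(percept: Tuple[str, str]) -> State:
    """Turn a raw (location, status) percept into a state description."""
    location, status = percept
    return {"location": location, "dirty": status == "Dirty"}


# Condition-action rules, checked in order (assumed rule set).
RULES: List[Rule] = [
    (lambda s: s["dirty"], "Suck"),
    (lambda s: s["location"] == "A", "Right"),
    (lambda s: s["location"] == "B", "Left"),
]


def rule_match(state: State, rules: List[Rule]) -> str:
    """Return the action of the first rule whose condition holds."""
    for condition, action in rules:
        if condition(state):
            return action
    return "NoOp"  # no rule fired


def simple_reflex_agent(percept: Tuple[str, str]) -> str:
    """Choose an action from the current percept only; nothing is remembered."""
    state = interpret_input(percept)
    return rule_match(state, RULES)


print(simple_reflex_agent(("A", "Dirty")))  # Suck
print(simple_reflex_agent(("A", "Clean")))  # Right
```

Because the agent keeps no memory, it will shuttle between clean squares forever; it cannot know that both squares are already clean.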
Maintains internal state to track the parts of the world it can’t see right now. To keep that state current, it combines three things: the previous state, knowledge of how the world evolves, and knowledge of what its own actions do.
function REFLEX-AGENT-WITH-STATE(percept) returns action
    static: state, rules, action
    state ← UPDATE-STATE(state, action, percept)
    rule ← RULE-MATCH(state, rules)
    action ← RULE-ACTION[rule]
    return action
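A sketch of the same idea in Python, again for an assumed two-square vacuum world. The class name, the `"Unknown"` marker, and the assumption that clean squares stay clean are illustrative choices, not part of the notes.

```python
from typing import Dict, Optional, Tuple


class ModelBasedVacuumAgent:
    """Reflex agent that remembers the status of squares it cannot see."""

    def __init__(self) -> None:
        self.state: Dict[str, str] = {"A": "Unknown", "B": "Unknown"}
        self.location: Optional[str] = None
        self.action: Optional[str] = None  # most recent action

    def update_state(self, percept: Tuple[str, str]) -> None:
        # What my actions do: sucking cleaned the square I was on.
        if self.action == "Suck" and self.location is not None:
            self.state[self.location] = "Clean"
        # How the world evolves: assumed static, so remembered squares
        # keep their last known status until perceived again.
        self.location, status = percept
        self.state[self.location] = status

    def __call__(self, percept: Tuple[str, str]) -> str:
        self.update_state(percept)
        if self.state[self.location] == "Dirty":
            self.action = "Suck"
        elif all(status == "Clean" for status in self.state.values()):
            self.action = "NoOp"  # everything known to be clean: stop
        else:
            self.action = "Right" if self.location == "A" else "Left"
        return self.action


agent = ModelBasedVacuumAgent()
for percept in [("A", "Dirty"), ("A", "Clean"), ("B", "Clean")]:
    print(agent(percept))  # Suck, then Right, then NoOp
```

The final `NoOp` is the payoff of the internal state: a memoryless agent seeing `("B", "Clean")` has no way to know that square A is also clean.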
Maps “what the world is like now” $\to$ “what it will be like if I do action A” $\to$ “what action I should do now”, choosing the action that moves it toward a specific goal.
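One possible reading of this chain is a small lookahead search: predict the result of each action, and return the first action on a shortest path to a goal state. The sketch uses breadth-first search as a stand-in planner; the corridor world, `result`, and `GOAL` are hypothetical.

```python
from collections import deque
from typing import Callable, Hashable, Optional, Sequence


def goal_based_action(state: Hashable,
                      actions: Sequence[str],
                      result: Callable[[Hashable, str], Hashable],
                      is_goal: Callable[[Hashable], bool]) -> Optional[str]:
    """Return the first action on a shortest path to a goal, or None."""
    if is_goal(state):
        return None  # already there, nothing to do
    frontier = deque([(state, None)])  # (state, first action taken)
    visited = {state}
    while frontier:
        current, first = frontier.popleft()
        for action in actions:
            predicted = result(current, action)  # "what it will be like"
            if predicted in visited:
                continue
            step = first if first is not None else action
            if is_goal(predicted):
                return step
            visited.add(predicted)
            frontier.append((predicted, step))
    return None  # goal unreachable


# Hypothetical world: positions 0..4 on a corridor, goal at position 3.
GOAL = 3


def result(position: int, action: str) -> int:
    """Assumed transition model: move one cell, clamped to the corridor."""
    return max(0, position - 1) if action == "Left" else min(4, position + 1)


print(goal_based_action(1, ("Left", "Right"), result, lambda p: p == GOAL))  # Right
print(goal_based_action(4, ("Left", "Right"), result, lambda p: p == GOAL))  # Left
```

Changing `GOAL` changes the behavior without touching any rules, which is the flexibility a goal-based design buys over condition-action rules.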
Maps states onto a degree of happiness (utility function) to handle multiple or conflicting goals (e.g., speed vs. safety).
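A minimal sketch of trading off speed against safety with a weighted utility function. The route names, the predicted outcome values, and the weights are all made-up assumptions for illustration.

```python
from typing import Callable, Dict

Outcome = Dict[str, float]


def make_utility(w_speed: float, w_safety: float) -> Callable[[Outcome], float]:
    """Build a utility function as a weighted sum of speed and safety."""
    def utility(outcome: Outcome) -> float:
        return w_speed * outcome["speed"] + w_safety * outcome["safety"]
    return utility


def utility_based_action(predicted: Dict[str, Outcome],
                         utility: Callable[[Outcome], float]) -> str:
    """Pick the action whose predicted outcome has the highest utility."""
    return max(predicted, key=lambda action: utility(predicted[action]))


# Assumed predictions on a 0..1 scale (illustrative only).
PREDICTED = {
    "highway": {"speed": 0.9, "safety": 0.5},
    "side_streets": {"speed": 0.4, "safety": 0.9},
}

cautious = make_utility(w_speed=1.0, w_safety=3.0)
hurried = make_utility(w_speed=1.0, w_safety=0.5)

print(utility_based_action(PREDICTED, cautious))  # side_streets (3.1 vs 2.4)
print(utility_based_action(PREDICTED, hurried))   # highway (1.15 vs 0.85)
```

A goal test would only say whether a route reaches the destination; the utility function lets the agent rank routes that all reach it but differ in how well they balance the competing objectives.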
Agents that can operate in initially unknown environments and become more competent over time. Learning agents aren’t really a fifth separate category; you can have a learning utility-based agent, a learning reflex agent, etc. They have four components: a learning element, a performance element, a critic, and a problem generator.
Designing the right performance measure is notoriously difficult; this is known as the “alignment problem” in modern AI safety.
Episodic environments are much easier to solve because the agent doesn’t need to think ahead.