Intelligent Agents

As a first step towards understanding AI systems, we begin by defining what we mean by an intelligent agent. We adopt the following commonly accepted notion of intelligent systems:

A system or program that perceives and interacts with its environment while making informed decisions to achieve a certain goal or maximize some utility.

Let's break down this definition into its components (a short code sketch after the list ties them together):

  • Perception: This is achieved by using sensors that are able to detect the state of the environment. Examples include cameras, microphones, radar sensors, proximity sensors, or simply any setup allowing a program to receive input information in a specific format.

  • Interaction: Environment interaction is achieved by using actuators that are able to change the state of the environment. In the physical world, one might think of robots or automated vacuum cleaners, whereas in programs, one might imagine a simulated chess move.

  • Decision Making: An agent should be able to produce an action to perform (usually choosing from a number of pre-specified actions), considering the current state of the environment. The decision may be based on some form of implicit scoring or reasoning to determine which action is the best.

  • Goal: The system must have a goal or objective that it is trying to achieve. Quantifying the objective and producing a mathematical framework for a computer program to use is often the most significant element of implementing an AI system.
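To tie these four components together, below is a minimal sketch of the perceive-decide-act loop in Python. Everything in it - the ReflexAgent class, the thermostat-style environment, the action names - is an illustrative assumption rather than a standard API; a real agent would replace each stub with actual sensors, actuators, and a task-appropriate decision procedure.

```python
class ReflexAgent:
    """A minimal, hypothetical agent skeleton: perceive -> decide -> act."""

    def __init__(self, goal_temp):
        # Goal: keep the temperature near a setpoint (our stand-in objective).
        self.goal_temp = goal_temp

    def perceive(self, environment):
        # Perception: read the relevant part of the environment's state.
        return environment["temperature"]

    def decide(self, percept):
        # Decision making: pick one of a few pre-specified actions
        # based on the current percept and the agent's goal.
        if percept < self.goal_temp - 1:
            return "heat"
        if percept > self.goal_temp + 1:
            return "cool"
        return "idle"

    def act(self, action, environment):
        # Interaction: the chosen action changes the environment's state.
        delta = {"heat": 0.5, "cool": -0.5, "idle": 0.0}
        environment["temperature"] += delta[action]


# A toy simulation loop exercising all four components.
env = {"temperature": 15.0}
agent = ReflexAgent(goal_temp=20.0)
for step in range(10):
    percept = agent.perceive(env)
    action = agent.decide(percept)
    agent.act(action, env)
    print(f"step {step}: action={action}, temp={env['temperature']:.1f}")
```

In AIMA's terminology this is a simple reflex agent: it maps the current percept directly to an action without maintaining any history of past percepts.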

The History of AI

While there is some debate about the true single origin story of AI, we can trace its evolution through a sequence of events, spread over a much shorter timeframe than one might expect. In my view, one of the most important events in the history of AI was the publication of the paper Computing Machinery and Intelligence by Alan Turing in 1950. In this paper, Turing proposed a test for determining whether a machine is intelligent or not, called the Imitation Game. The test, now commonly known as the Turing Test, can be very briefly summarized as follows:

Assume two players, A and B, are in separate rooms. Player A is a human, and player B is either a human or a machine. Player C is an interrogator who can ask (typewritten) questions to both players A and B. The interrogator's goal is to determine which of the two players is the human. The machine's goal is to fool the interrogator into thinking that it is the human. If the machine is able to fool the interrogator, then it can be said to be intelligent.

Turing, in his paper, also discusses several paradoxes and objections to such a test - based in theology, denial, mathematics, the theory of consciousness and even extra-sensory perception (telepathy) - making this paper a very interesting read. We shall skip the specifics in these notes for the sake of brevity, but the full paper is linked above. That said, I do want to pose the following questions to you, the reader:

  • A machine may mimic human thinking in a number of ways - hardcoded sequences, randomness, or some measure of which choices are the best. Which of these would you consider intelligent behavior?
  • Can a machine be considered intelligent if it is able to fool a human into thinking that it is intelligent?
  • Could you construct a machine whose workings you couldn't explain?
  • How do free will and consciousness interact with intelligence?

I hope you find some of these questions challenging to answer; in particular, I hope your own answers do not fully satisfy your curiosity - because then, taking a course in AI will be a meaningful learning experience. But enough with the what-ifs; let's get back to more history. In August 1955, John McCarthy, Marvin Minsky, Nathaniel Rochester and Claude Shannon proposed a summer workshop at Dartmouth College (held in the summer of 1956) to explore

... the conjecture that every aspect of learning or any other feature of intelligence can be, in principle, so precisely described that a machine can be made to simulate it.

This workshop is widely considered to be the birth of AI as a field of study, and the term 'artificial intelligence' was coined by McCarthy in the proposal. The attendees were quite optimistic about the future of AI, and several predictions were publicly made about how soon a machine would be able to perform tasks that a human could. While the predicted timeframes were off by several decades, AI research has finally caught up to the point where autonomous systems can satisfactorily perform some tasks that humans do, and sometimes even outperform us!

The Vocabulary of AI Environments

Here is a very brief overview of some key terms that we commonly use to categorize an environment based on various factors; a short code sketch after the list shows one way to record these properties. For more detail, please refer to Chapter 2 of the primary textbook (AIMA 4th Ed).

  • Fully v/s Partially Observable Environments: A fully observable environment is one where the agent has complete knowledge of the environment. This may include things like the layout or configuration of the environment, the agent's position, the positions of other objects or agents, etc. A partially observable environment is one where the agent has limited information about the environment. An example of this is an agent using a camera as a perception device being limited by its field of view.

  • Single-Agent v/s Multi-Agent: In a single-agent system, there is only one autonomous entity or agent that makes decisions and takes actions to achieve a specific goal. In contrast, in a multi-agent system, multiple autonomous entities or agents interact with each other to achieve their individual or collective goals. For example, a robot vacuum is typically a single-agent system whereas a swarm of drones which communicate with each other for collision avoidance or to maintain formation can be considered a multi-agent system.

  • Deterministic v/s Non-Deterministic: In a deterministic environment, the environment behaves in a predictable manner, which is to say that a particular action from any given state will always have the same, repeatable outcome. In a non-deterministic environment, on the other hand, outcomes have some degree of randomness or chance associated with them. We typically model non-deterministic environments using probabilistic methods.

  • Episodic v/s Sequential: In episodic environments, the current state and action have no impact on future states; each timestep behaves as a distinct episode, independent of the past and the future. An example of such an environment would be a robot on an assembly line checking each item for defects. In contrast, a sequential environment is more suited to tasks where there is a temporal dependence of states on chains of actions. Driving, for instance, is highly sequential in nature.

  • Static v/s Dynamic: A static environment is one that is fixed and unchanging. The configuration of an office space (for the purposes of using a Roomba robot vacuum) is one such example. A dynamic environment is one that changes or evolves with time, such as roads viewed by a self-driving car.

  • Discrete v/s Continuous: This nomenclature is a bit of a misnomer, since we are really reasoning about whether the action space is continuous or discrete. A discrete action space is one where the agent picks one out of a fixed number of actions at each time step (such as playing chess), and an environment that supports only discrete actions is called a discrete environment. A continuous action is one that carries a magnitude, such as the acceleration or braking of a car (neither of which is a binary input in the real world). Environments that model continuous actions are therefore called continuous environments.

  • Known v/s Unknown: A known environment refers to whether the agent knows the rules of the environment, rather than the state. Consider the game of solitaire for instance; this is an environment that is only partially observable, since we do not know the order of cards still in the deck, but since we fully know the rules of the game, we say that this is a known environment. An unknown environment is one where we don't know how the environment behaves (driving in Boston is a great example of this).
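To make this taxonomy concrete, here is the sketch promised above: a small record of these seven dimensions for a few familiar environments. The EnvironmentProfile dataclass and the sample classifications are my own illustration, not notation from the textbook, and a few entries are genuinely debatable.

```python
from dataclasses import dataclass

@dataclass
class EnvironmentProfile:
    """Illustrative record of the seven dimensions discussed above."""
    fully_observable: bool
    multi_agent: bool
    deterministic: bool
    episodic: bool
    static: bool
    discrete: bool
    known: bool

# Sample classifications (debatable in places: chess played with a
# clock, for example, is sometimes treated as semi-dynamic).
chess = EnvironmentProfile(
    fully_observable=True, multi_agent=True, deterministic=True,
    episodic=False, static=True, discrete=True, known=True)

self_driving_car = EnvironmentProfile(
    fully_observable=False, multi_agent=True, deterministic=False,
    episodic=False, static=False, discrete=False, known=False)

solitaire = EnvironmentProfile(
    fully_observable=False, multi_agent=False, deterministic=True,
    episodic=False, static=True, discrete=True, known=True)
```

Filling out such a profile is a useful first step when formulating a new problem, since each dimension rules certain families of algorithms in or out.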

Problem Formulation

Now that we have a more precise vocabulary at our disposal, let us briefly revisit and expand upon our definition of an intelligent agent, with a focus on how the factors we discussed play a role in modeling various real life problems.

  • Perception: Recall that perception refers to how an agent 'sees' its environment; the input to any AI model is what we call the percept of the model. This input may be in the form of an explicit model or an internal representation of the entire state space. Imagine, if you will, the game of tic-tac-toe. One could easily construct a computer program to list every possible configuration of a tic-tac-toe board in a reasonable timeframe (see the sketch after this list), and then use the output as the search space. This approach, however, is limited to deterministic environments, and may not be scalable. Chess, for instance, has a game tree with more possible move sequences than the number of atoms in the known universe. For agents such as Roombas or self-driving cars operating in non-deterministic environments, the percept is usually a constant feed of the agent's immediate surroundings, processed using techniques such as computer vision algorithms. For agents such as ChatGPT, the percept is a natural language input.

  • Interaction: Interaction in the physical world is easily understood as an agent moving or affecting either itself or objects/entities in its environment by means of mechanical actuation. For an intelligent computer program to be able to communicate with the user, interaction often takes the form of text, images, speech, etc. The output of an image generation model is also a form of interaction.

  • Learning: Perhaps the aspect of intelligent systems we will spend most of the semester working on, learning is at the very core of artificial intelligence. Learning refers to the process through which an agent may improve or tailor its outputs or interactions based on either data or feedback. Machine learning, reinforcement learning, and deep learning are all aspects of learning that you may be familiar with. Another form that learning may take is for the program to have access to a knowledge base in a format that can be parsed in order to make decisions. Rule-based systems and computations over search spaces are examples of this.

  • Reasoning and Rationality: The final aspect of decision making refers to whether an intelligent agent is acting in a rational manner with an objective in mind. Common considerations for the decision making aspects of an AI model include notions of utility, fairness, and biases that may affect the model's output. Modeling objectives in a manner that satisfies all these aspects at the same time is often impossible, and finding optimal tradeoffs remains an open research problem.
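As a concrete illustration of the perception bullet above, here is the sketch referenced earlier: a short program that exhaustively enumerates the reachable state space of tic-tac-toe. The helper functions are my own illustrative construction; the point is simply that the entire space fits comfortably in memory, which is exactly what fails to scale for a game like chess.

```python
def successors(board, player):
    """Yield every board reachable by `player` marking one empty cell."""
    for i, cell in enumerate(board):
        if cell == " ":
            yield board[:i] + player + board[i + 1:]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    lines = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals
    for a, b, c in lines:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def reachable_states():
    """Exhaustively enumerate all positions reachable from the empty board."""
    start = " " * 9          # the board as a 9-character string; X moves first
    seen = {start}
    frontier = [(start, "X")]
    while frontier:
        board, player = frontier.pop()
        if winner(board) is not None:
            continue         # the game is over; do not expand further
        for child in successors(board, player):
            if child not in seen:
                seen.add(child)
                frontier.append((child, "O" if player == "X" else "X"))
    return seen

print(len(reachable_states()))   # 5478 legal positions, empty board included
```

Running the same enumeration for chess is hopeless, which is why practical agents rely on search techniques that explore only a tiny sliver of the state space.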