As its name suggests, Artificial Intelligence (AI) is the quest to enable machines to become as intelligent as humans. AI has different components such as sensors (cameras, motion sensors, etc.), data storage and transfer, and mechanics (used in robotics and autonomous cars); however, the bulk of AI is software-based and involves algorithms that aim to do the “right thing” in a given situation or to give the right answers to given questions.
In order to understand what artificial intelligence actually is, we first need to understand what intelligence itself is. Note that other animals have some degree of intelligence as well. Even ants, whose brains contain only about 250,000 neurons, know how to split a large piece of food and carry it back to the colony with the help of fellow ants.
The following are regarded as components of intelligence:
- Learning from experience.
- Ability to predict the result of an action and find the right action to take.
- Ability to answer questions.
We can therefore give the following concise definition of intelligence: intelligence is the ability to learn from the past experiences of oneself and others in order to predict and generalize. For example, when you are playing baseball as a batter, you try to predict where the pitched ball is headed. Then your brain subconsciously tries to find a suitable way for your arm to move so that the bat hits the ball at the right angle. As you practice more and more, you learn from your experience, and both your predictions and your actions improve. If you think about it, you see that a similar procedure is going on in other intelligent actions. For example, when guessing the next number in a given sequence, we are again using past knowledge (the existing numbers in the sequence) to predict something.
Artificial Intelligence is about enabling machines to do similar things. Note that the answers and predictions AI comes up with for a given problem are not necessarily unique, just as the answers humans come up with for the same problem or question are not unique. Note also that there is a fundamental difference between human and machine intelligence. Human intelligence is based on electrical signals and neurotransmitters in the brain, which work slowly compared to the electronic and photonic processes that power computers. There are also fundamental differences between the architecture of the brain and that of computers. In most computers, the processor and the memory are separate from each other, while in the brain, memory (synapses) is intertwined with processing.
Predicting an outcome based on past experience can be further formalized as follows. We have a source space $latex X$ that contains all the possible situations in our problem and a space $latex Y$ containing all the possible outcomes. We also have some situation-outcome pairs given by our past experience:
$latex (x_1, y_1), (x_2, y_2),\ldots, (x_k, y_k)$
Here each $latex x_i$ is a situation and $latex y_i$ is its outcome. We want to guess a mapping $latex f(x)$ from $latex X$ to $latex Y$ that fits the training data above, i.e. $latex f(x_1)=y_1, f(x_2)=y_2, \ldots, f(x_k)=y_k$, and that, given a situation $latex x$ not in our past experience set, tells us its outcome as $latex f(x)$. As such, a great part of AI, including its currently dominant paradigm called Algorithmic Learning, involves approximating mappings (also called functions), an algorithmic and mathematical quest. For example, $latex X$ can be the set of all pictures of animals and $latex Y$ the set of animal names, and we want our function $latex f(x)$ to tell us the name of an animal based on its picture. Note that we convert text, images and speech to sequences and arrays of numbers so that they can be processed by computers. Note also that there are other paradigms of AI, besides Algorithmic Learning, such as Inferential Learning, about which I will write later.
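To make this concrete, here is a minimal sketch in Python of learning a mapping from situation-outcome pairs. It assumes the simplest possible family of functions, lines $latex f(x)=ax+b$, and fits $latex a$ and $latex b$ to the training pairs by ordinary least squares; the data and function names are purely illustrative, not part of any particular library.

```python
# A minimal sketch of learning a mapping f from situation-outcome pairs.
# We assume (purely for illustration) that f is linear, f(x) = a*x + b,
# and fit a and b by ordinary least squares.

def fit_linear(pairs):
    """Return (a, b) minimizing sum((a*x + b - y)^2) over the pairs."""
    n = len(pairs)
    sx = sum(x for x, _ in pairs)
    sy = sum(y for _, y in pairs)
    sxx = sum(x * x for x, _ in pairs)
    sxy = sum(x * y for x, y in pairs)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

# Past experience: situation-outcome pairs (x_i, y_i), generated by y = 2x + 1.
pairs = [(1, 3), (2, 5), (3, 7), (4, 9)]
a, b = fit_linear(pairs)

def f(x):
    """The learned mapping, usable on situations not in the training set."""
    return a * x + b

print(f(5))  # prediction for an unseen situation x = 5
```

The key point is the last line: once the function is fitted to past experience, it generalizes to situations it has never seen.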
There are situations when we want to guess the function $latex f$ without having any of the situation-outcome pairs above. What guides us in such an Unsupervised Learning problem, is the geometry of the space $latex X$, i.e. how its points are situated next to each other. Examples of such problems include Clustering (grouping similar points together) and Dimensionality Reduction (removing useless details from data).
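As a toy illustration of such an unsupervised problem, the following sketch clusters one-dimensional points with a bare-bones k-means loop (k = 2). No outcomes $latex y_i$ are given; only the geometry of the points, i.e. their distances to each other, guides the grouping. The initialization and the data are illustrative assumptions, not a production-quality clustering routine.

```python
# Toy unsupervised learning: cluster 1-D points with k-means (k = 2).
# No outcomes are given; only distances between points guide the grouping.

def kmeans_1d(points, k=2, iters=20):
    """Cluster 1-D points into k groups by alternating assign/update steps."""
    # Illustrative initialization (works for k = 2): centers at the extremes.
    centers = [min(points), max(points)]
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: abs(p - centers[c]))
            clusters[i].append(p)
        # Update step: move each center to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

points = [1.0, 1.2, 0.8, 9.9, 10.1, 10.0]
centers, clusters = kmeans_1d(points)
print(sorted(centers))  # one center near 1.0, another near 10.0
```

The two groups emerge purely from how the points are situated next to each other, which is exactly what distinguishes this setting from the supervised one above.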
Since the space of all possible functions is huge (actually infinite-dimensional), we often restrict ourselves to a family of functions, and different approaches to AI use different families to approximate the sought-after function. In linear regression, linear functions are used, while in polynomial regression we use polynomials to approximate our sought-after function $latex f$. Neural Networks, which mimic the working of brain neurons to an extent, use compositions of functions of the form $latex h(\phi(x))$, where $latex \phi$ is an affine (linear) function and $latex h$ is a nonlinear threshold function called the activation function. In Instance-based Learning, one hypothesizes that $latex f(x)$ must be an average of $latex f(x_i)$ for the $latex x_i$ close to $latex x$. Note also that the relation between situations and outcomes is not always given to us in a clear and deterministic way as in $latex f(x_i)=y_i$. Often there is an amount of uncertainty involved, and for this reason we need to take into account situations that have more than one possible outcome, each with a given probability. This leads to probabilistic methods of Machine Learning and function approximation, such as Bayesian Learning.
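The neural-network building block $latex h(\phi(x))$ can be sketched as a single artificial neuron: $latex \phi$ computes a weighted sum of the inputs plus a bias, and $latex h$ squashes the result nonlinearly. The sigmoid activation and the weights below are arbitrary illustrative choices, not learned values.

```python
import math

# A sketch of the building block of a neural network: a single "neuron"
# computing h(phi(x)). The weights are illustrative, not learned.

def phi(x, weights, bias):
    """Affine map: weighted sum of the inputs plus a bias term."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def h(z):
    """Sigmoid activation: squashes any real z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, weights, bias):
    """The composition h(phi(x)): one unit of a neural network."""
    return h(phi(x, weights, bias))

out = neuron([1.0, 2.0], weights=[0.5, -0.25], bias=0.0)
print(out)  # h(0.5*1 - 0.25*2) = h(0) = 0.5
```

A full network stacks many such units in layers, composing these simple functions into a very flexible family of approximators.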
Actually, the type of function related to each problem is determined, to a good extent, by the world we live in. Just as the path of a projectile has a parabolic shape, the set of natural images, or of natural RNA molecules, occupies a small and very specific subspace of the set of all possible images or of all possible chains of nucleotides. For example, in natural images the value of each pixel is correlated with the pixels around it. From this point of view, a great quest in AI is to find the function types related to different problems. Yes, it is extremely difficult to do AI in a world in which anything is possible; however, things change when we take into account the constraints that our world puts on possibilities.
Put differently, we need to find an algorithm that can predict the right type of function, no matter what the problem is. Our brains seem to possess such a Master Algorithm, one that knows the type of function for various problems such as perception (auditory, visual or tactile), logic, higher thinking, etc. This ability is the result of millions of years of evolution: our brains have grown with the world we live in and are a part of it.