
separated by the same feature, just that each has something unique about it that allows you to
break it into roughly equal, easily identifiable groups. One of the most popular clustering
algorithms is K-means, which is a specific instance of a more powerful technique called the E-M
algorithm.
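To make the alternation concrete, here is a minimal NumPy sketch of K-means (the toy data and the simple first-k initialization are ours for illustration): the E-step assigns each point to its nearest centroid, and the M-step moves each centroid to the mean of its assigned points.

```python
import numpy as np

def kmeans(points, k, iterations=10):
    # For simplicity, initialize centroids to the first k points
    centroids = points[:k].astype(float).copy()
    for _ in range(iterations):
        # E-step: assign each point to its nearest centroid
        dists = np.linalg.norm(points[:, None] - centroids[None, :], axis=2)
        labels = dists.argmin(axis=1)
        # M-step: move each centroid to the mean of its assigned points
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return centroids, labels

# Two small, well-separated groups of 2-D points
data = np.array([[0.0, 0.0], [10.0, 10.0],
                 [0.5, 0.0], [9.5, 10.0],
                 [0.0, 0.5], [10.0, 9.5]])
centroids, labels = kmeans(data, k=2)
```

After a few iterations the assignments stop changing, which is how K-means typically detects convergence in practice.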
Dimensionality reduction is about manipulating the data to view it under a much simpler
perspective. It is the ML equivalent of the phrase, “Keep it simple, stupid.” For example, by
getting rid of redundant features, we can explain the same data in a lower-dimensional space
and see which features really matter. This simplification also helps in data visualization or
preprocessing for performance efficiency. One of the earliest algorithms is Principal Component
Analysis (PCA), and some newer ones include autoencoders, which we’ll cover in chapter 7.
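Here is a short sketch of the idea behind PCA using NumPy's singular value decomposition (the made-up data has three features that all vary along a single direction, so one component captures everything):

```python
import numpy as np

def pca(data, n_components):
    """Project `data` (n x d) onto its top n_components principal axes."""
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal directions, ordered by variance explained
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# 3-D points whose features are redundant copies of one underlying signal
t = np.arange(5, dtype=float)
points = np.column_stack([t, 2 * t, 3 * t])
reduced = pca(points, n_components=1)  # one dimension suffices here
```

Because the three features are perfectly correlated, the single-component projection preserves all of the variance in the original data.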
1.4.3 Reinforcement Learning
Supervised and unsupervised learning seem to suggest that the existence of a teacher is all
or nothing. But there is a well-studied branch of machine learning where the environment acts
as a teacher, providing hints rather than definite answers. The learning system receives
feedback on its actions, with no concrete promise that it's progressing toward its goal,
which might be to solve a maze or accomplish some other explicit objective.
Exploration vs. Exploitation is the heart of reinforcement learning
Imagine playing a video game that you've never seen before. You click buttons on a controller and discover that a
particular combination of presses gradually increases your score. Brilliant! Now you repeatedly exploit this finding in
hopes of beating the high score. In the back of your mind, though, you wonder whether there's a better combination
of button presses that you're missing out on. Should you exploit your current best strategy, or risk exploring new options?
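The dilemma in the sidebar can be sketched with a simple epsilon-greedy strategy (the button combinations and their hidden payoff rates below are made up for illustration): most of the time the agent exploits its best-known action, but with small probability it explores a random one.

```python
import random

def play(actions, rounds=10000, epsilon=0.1, seed=0):
    """actions maps name -> hidden win probability; the learner
    only observes win/loss outcomes, never the probabilities."""
    rng = random.Random(seed)
    counts = {a: 0 for a in actions}
    values = {a: 0.0 for a in actions}  # running average reward per action
    for _ in range(rounds):
        if rng.random() < epsilon:
            choice = rng.choice(list(actions))    # explore a random action
        else:
            choice = max(values, key=values.get)  # exploit the best so far
        reward = 1.0 if rng.random() < actions[choice] else 0.0
        counts[choice] += 1
        values[choice] += (reward - values[choice]) / counts[choice]
    return values

# Two hypothetical button combos with different hidden payoff rates
estimates = play({"combo_a": 0.3, "combo_b": 0.7})
```

Even if the agent latches onto the worse combo early, the occasional exploration eventually reveals the better one, and exploitation switches over.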
Unlike supervised learning, where training data is conveniently labeled by a “teacher,”
reinforcement learning trains on information gathered by observing how the environment
reacts to actions. In other words, reinforcement learning is a type of machine learning that
interacts with the environment to learn which combination of actions yields the most favorable
results. Since we're already anthropomorphizing our algorithm by using the words
“environment” and “action,” scholars typically refer to the system as an autonomous “agent.”
Therefore, this type of machine learning lends itself naturally to the domain of robotics.
To reason about agents in the environment, we introduce two new concepts: states and
actions. The status of the world frozen at some particular time is called a state. An agent may
perform one of many actions to change the current state. To drive an agent to perform actions,
each state yields a corresponding reward. An agent eventually discovers the expected total
reward of each state, called the value of a state.
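The states, actions, rewards, and values above can be sketched with a few lines of Python (the two-state toy world and its transitions are invented for illustration; the update rule is the standard value-iteration idea, with a discount factor that weighs future rewards less than immediate ones):

```python
def state_values(rewards, transitions, discount=0.9, iterations=100):
    """Estimate each state's value: the expected total discounted reward
    from that state, assuming the agent always picks the best action.
    transitions[s] maps action name -> resulting state."""
    values = {s: 0.0 for s in rewards}
    for _ in range(iterations):
        values = {
            s: rewards[s]
               + discount * max(values[nxt] for nxt in transitions[s].values())
            for s in rewards
        }
    return values

# Tiny hypothetical world: from "start" the agent can stay put or move
# to "goal", which is the only state that yields a reward
rewards = {"start": 0.0, "goal": 1.0}
transitions = {
    "start": {"stay": "start", "go": "goal"},
    "goal": {"stay": "goal"},
}
values = state_values(rewards, transitions)
```

With a discount of 0.9, the value of "goal" converges to 1 / (1 - 0.9) = 10, and "start" converges to 0.9 times that, so the agent learns that moving toward the goal is the better action.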
Like any other machine learning system, performance improves with more data. In this
case, the data is a history of previous experiences. In reinforcement learning, we do not know
the final cost or reward of a series of actions until it’s executed. These situations render
traditional supervised learning ineffective, because we do not know exactly which action in the